), this paragraph's
# start and end tags will be removed from the result (keeping only its
# content). Avoid using this. p_unwrap should only be used internally by
# POD itself.
# When a style must be generated in the pod result, it can be stored
# either in content.xml or in styles.xml. In the vast majority of cases,
# choosing this "store" is automatically done by POD: you don't have to
# worry about it. That being said, if you want to bypass the POD logic
# and force the store, you can do it by passing a dict in p_stylesStore.
# Every key must be an HTML tag name; every value is either "content"
# (store the style in content.xml) or "styles_base" (store it in
# styles.xml). If p_stylesStore is None or if the tag corresponding to
# the current style to generate is not among its keys, the base POD
# logic will apply.
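# As an illustration (the tag names are chosen arbitrarily), this value
# would force styles generated for "p" tags into content.xml and those
# generated for "td" tags into styles.xml:
#   stylesStore={'p': 'content', 'td': 'styles_base'}
# Styles generated for any other tag would still follow the base POD
# logic.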
stylesMapping = self.stylesManager.checkStylesMapping(stylesMapping)
if html is None: html = self.html
return Xhtml2OdtConverter(s, self.stylesManager, stylesMapping,
keepWithNext, keepImagesRatio, imagesMaxWidth, imagesMaxHeight,
self, html, inject, unwrap, stylesStore).run()
def renderText(self, s, prefix=None, tags=None, firstCss=None,
otherCss=None, lastCss=None, stylesMapping={}):
'''Renders the pure text p_s in the ODF result. For every newline
character present in p_s, a new paragraph is created in the ODF
result.'''
# p_prefix, if passed, must be a string that will be inserted before
# p_s's first line, followed by a tab.
# p_tags can be a sequence of XHTML single-letter tags (e.g., "bu" for
# "bold" + "underline"). If passed, formatting as defined by these
# letters will be applied to the whole p_s (the p_prefix excepted).
# You may go further by applying CSS classes to the paragraphs produced
# by p_s's conversion to ODF. For every such class, if a style having
# the same name is found on the ODF template, it will be applied to the
# corresponding paragraph. The CSS class:
#- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# defined in | will be
# attribute ... | applied to ...
#- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# p_firstCss | the ODF result's first paragraph;
# p_lastCss | the ODF result's last paragraph;
# p_otherCss | any other paragraph in between.
#- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Moreover, because, internally, p_s is converted to a chunk of XHTML
# code, and passed to method m_renderXhtml, a p_stylesMapping can be
# passed to m_renderText, exactly as you would do for m_renderXhtml.
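# As a sketch, assuming each line of p_s ends up in its own XHTML "p"
# tag (the class names F, O and L are hypothetical), this call:
#   renderText('a\nb\nc', tags='b', firstCss='F', otherCss='O',
#              lastCss='L')
# would hand a chunk like this one to m_renderXhtml:
#   <p class="F"><b>a</b></p>
#   <p class="O"><b>b</b></p>
#   <p class="L"><b>c</b></p>
# If p_s contains a single line, p_firstCss is the class being applied.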
#
# Determine the prefix to insert before the first line
prefix = f'{prefix}' if prefix else ''
# Split p_s into lines of text
r = s.split('\n')
if tags:
# Prepare the opening and closing sub-tags
opening = ''.join([f'<{tag}>' for tag in tags])
closing = ''
j = len(tags) - 1
while j >= 0:
closing = f'{closing}</{tags[j]}>'
j -= 1
else:
opening = closing = ''
i = length = len(r) - 1
pre = ''
while i >= 0:
# Get the CSS class to apply
if i == 0:
css = firstCss
pre = prefix
elif i == length:
css = lastCss
else:
css = otherCss
css = f' class="{css}"' if css else ''
r[i] = f'<p{css}>{pre}{opening}{Escape.xhtml(r[i])}{closing}</p>'
i -= 1
return self.renderXhtml(''.join(r), stylesMapping=stylesMapping)
# Supported image formats. "image" represents any format
imageFormats = 'png', 'jpeg', 'jpg', 'gif', 'svg', 'image'
ooFormats = 'odt',
convertibleFormats = FILE_TYPES.keys()
# Constant indicating if a renderer must inherit a value from its parent
# renderer.
INHERIT = 1
def importDocument(self, content=None, at=None, format=None,
anchor='as-char', wrapInPara=True, size=None, sizeUnit='cm',
maxWidth='page', maxHeight='page', style=None, keepRatio=True,
pageBreakBefore=False, pageBreakAfter=False, convertOptions=None):
'''Implements the POD statement "do... from document"'''
# If p_at is not None, it represents a path or url where the document can
# be found. It can be a string or a pathlib.Path instance. If p_at is
# None, the content of the document is supposed to be in binary format
# in p_content. The document p_format may be: odt or any format in
# imageFormats.
# p_anchor, p_wrapInPara, p_size, p_sizeUnit, p_style and p_keepRatio
# are only relevant for images:
# * p_anchor defines the way the image is anchored into the document;
# valid values are 'page', 'paragraph', 'char' and 'as-char', but
# apply only in ODT templates. In an ODS template, implicit anchor
# will be "to cell" and p_anchor will be ignored. In an ODS
# template, anchoring "to the page" is currently not supported ;
# * p_wrapInPara, if true, wraps the resulting 'image' tag into a 'p'
# tag. Do NOT use this when importing an image into an ODS result
# via this method. Here is an example of image import into an ODS
# file that will work:
# do cell
# from+ document(at='/some/path/a.png', wrapInPara=False)
# * p_size, if specified, is a tuple of float or integers (width,
# height) expressing size in p_sizeUnit (see below). If not
# specified, size will be computed from image info ;
# * p_sizeUnit is the unit for p_size elements, it can be "cm"
# (centimeters), "px" (pixels) or "pc" (percentage). Percentages,
# in p_size, must be expressed as integers from 1 to 100 ;
# * if p_maxWidth (p_maxHeight) is specified (as a float value
# representing cm), image's width (height) will not be greater
# than it. If p_maxWidth (p_maxHeight) is "page" (the default),
# p_maxWidth (p_maxHeight) will be computed as the width of the
# main document's page style, margins excluded ;
# * if p_style is given, it is an appy.shared.css.CssStyles instance,
# containing CSS attributes. If "width" and "height" attributes
# are found there, they will override p_size and p_sizeUnit ;
# * If p_keepRatio is True, the image width/height ratio will be kept
# when p_size is specified.
# p_pageBreakBefore and p_pageBreakAfter are only relevant for importing
# external odt documents, and allow inserting a page break before/after
# the inserted document. More precisely, each of these parameters can
# have values:
#- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# True | insert a page break ;
#- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# False | do not insert a page break.
#- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Moreover, p_pageBreakAfter can have these additional values:
#- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# 'duplex' | insert 2 page breaks if the sub-document has an odd
# | number of pages, 1 else (useful for duplex printing) ;
#- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# '<name>' | inserts a page break mentioning the name of a specific
# | page style to apply to the remainder of the document.
# | It can be an alternative to using 'duplex', if the
# | page name forces LibreOffice to insert a virtual page,
# | e.g. 'Right_20_Page'.
# | ⚠️ The referred page style must be used within the
# | main pod template: it must be applied to one of its
# | paragraphs. Else, LibreOffice will not include it
# | in the pod result. To ensure it is the case, you
# | may define an empty paragraph, with a pod statement
# | "do text if False", that will prevent it from being
# | rendered in the pod result but will ensure its
# | related styles will be part of it. Then, for this
# | paragraph, go to "Paragraph > Text flow" tab,
# | define a break of type "Page", position "Before",
# | with a page style you choose in the list.
#- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
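# For example (the path is hypothetical, and the statement is assumed to
# be written on a paragraph of the main template), this statement would
# import a sub-document and insert, after it, 2 page breaks if its number
# of pages is odd, 1 otherwise:
#   do text
#   from document(at='/some/path/sub.odt', pageBreakAfter='duplex')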
# If p_convertOptions are given (for images only), imagemagick will be
# called with these options to perform some transformation on the image.
# For example, if you specify
# convertOptions="-rotate 90"
# pod will perform this command before importing the file into the
# result:
# convert your.image -rotate 90 your.image
# You can also specify a function in convertOptions. This function will
# receive a single arg, "image", an instance of
# appy.pod.doc_importers.Image giving some characteristics of the image
# to convert, like image.width and image.height in pixels (integers). If
# your function does not return a string containing the convert options,
# no conversion will occur.
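# As an illustration, such a function (its name is hypothetical) could
# rotate an image only when its width exceeds its height:
#   def rotateIfWide(image):
#       if image.width > image.height: return '-rotate 90'
# It would then be passed as convertOptions=rotateIfWide, assuming the
# function is reachable from the pod context. For an image that is not
# wider than it is high, the function returns None: no conversion occurs.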
# Is there something to import ?
if not content and not at: raise PodError(DOC_KO)
# Convert p_at to a string if it is not the case
at = str(at) if isinstance(at, Path) else at
# Guess document format
if not format:
# It should be deduced from p_at
if not at:
raise PodError(DOC_FMT_KO)
format = os.path.splitext(at)[1][1:]
else:
# If format is a mimeType, convert it to an extension
if format in utils.mimeTypesExts:
format = utils.mimeTypesExts[format]
format = format.lower()
isImage = isOdt = False
importer = None
if format in self.ooFormats:
importer = importers.OdtImporter
self.forceOoCall = True
isOdt = True
elif format in self.imageFormats or not format:
# If the format can't be guessed, we suppose it is an image
importer = importers.ImageImporter
isImage = True
elif format == 'pdf':
importer = importers.PdfImporter
elif format in self.convertibleFormats:
importer = importers.ConvertImporter
else:
raise PodError(DOC_FMT_NS % format)
imp = importer(content, at, format, self)
# Initialise image-specific parameters
if isImage:
imp.init(anchor, wrapInPara, size, sizeUnit, maxWidth, maxHeight,
style, keepRatio, convertOptions)
elif isOdt: imp.init(pageBreakBefore, pageBreakAfter)
return imp.run()
def importImage(self, content=None, at=None, format=None, size=None,
sizeUnit='cm', maxWidth='page', maxHeight='page',
style=None, keepRatio=True, convertOptions=None):
'''While m_importDocument imports a document or image via a
"do ... from document" statement, method m_importImage imports an
image anchored as a char via a POD expression
":image(...)", having almost the same parameters.'''
# Compared to the "document" function, and due to the specific nature of
# the "image" function, 2 parameters are missing:
# (1) "anchor" is forced to be "as-char";
# (2) "wrapInPara" is forced to False.
return self.importDocument(content=content, at=at, format=format,
wrapInPara=False, size=size, sizeUnit=sizeUnit, maxWidth=maxWidth,
maxHeight=maxHeight, style=style, keepRatio=keepRatio,
convertOptions=convertOptions)
def getResolvedNamespaces(self):
'''Gets a context where mainly used namespaces have been resolved'''
env = self.stylesParser.env
return {'text': env.ns(env.NS_TEXT), 'style': env.ns(env.NS_STYLE)}
def importPod(self, content=None, at=None, format='odt', context=None,
pageBreakBefore=False, pageBreakAfter=False,
managePageStyles='rename', resolveFields=False,
forceOoCall=False):
'''Implements the POD statement "do... from pod"'''
# Similar to m_importDocument, but allows to import the result of
# executing the POD template whose absolute path is specified in p_at
# (or, but deprecated, whose binary content is passed in p_content, with
# this p_format) and include it in the POD result.
# Renaming page styles for the sub-POD (p_managePageStyles being
# "rename") ensures there is no name clash between page styles (and tied
# elements such as headers and footers) coming from several sub-PODs or
# with styles defined at the master document level. This takes some
# processing, so you can set it to None if you are sure you do not need
# it.
# p_resolveFields has the same meaning as the homonym parameter on the
# Renderer.
# By default, if p_forceOoCall is True for p_self, a sub-renderer run by
# a c_PodImporter will inherit from this attribute, except if
# parameter p_forceOoCall is different from INHERIT.
# ~
# Is there a pod template defined ?
if not content and not at: raise PodError(DOC_KO)
imp = importers.PodImporter(content, at, format, self)
# Importing a sub-pod into a main pod is done by defining, in the main
# pod, a section linked to an external file being the sub-pod result.
# Then, LibreOffice is asked to "resolve" such sections = replacing the
# section with the content of the linked file. This task is done by
# LibreOffice itself and triggered via a UNO command. Consequently,
# Renderer attribute "forceOoCall" must be forced to True in that case.
self.forceOoCall = True
# Define the context to use: either the current context of the current
# POD renderer, or p_context if given.
context = context or self.contentParser.env.context
imp.init(context, pageBreakBefore, pageBreakAfter, managePageStyles,
resolveFields, forceOoCall)
return imp.run()
def importCell(self, content, style='Default'):
'''Creates a chunk of ODF code ready to be dumped as a table cell
containing this p_content and styled with this p_style.'''
return f'' \
f'{content}'
def drawShape(self, name, type='rect', stroke='none', strokeWidth='0cm',
strokeColor='#666666', fill='solid', fillColor='#666666',
anchor='char', width='1.0cm', height='1.0cm',
deltaX='0.0cm', deltaY='0.0cm', target='styles'):
'''Renders a shape within a POD result'''
# Generate a style for this shape and add it among dynamic styles
style = self.stylesManager.stylesGenerator.addGraphicalStyle(target,
stroke=stroke, strokeWidth=strokeWidth, strokeColor=strokeColor,
fill=fill, fillColor=fillColor)
# Return the ODF code representing the shape
return f'' \
f''
def _insertBreak(self, type):
'''Inserts a page or column break into the result'''
name = 'podPageBreakAfter' if type == 'page' else 'podColumnBreak'
return f''
def insertPageBreak(self): return self._insertBreak('page')
def insertColumnBreak(self): return self._insertBreak('column')
def prepareFolders(self):
'''Ensure p_self.result is correct and create, when relevant, the temp
folder for preparing it.'''
# Ensure p_self.result is an absolute path
self.result = os.path.abspath(self.result)
# Raise an error if the result already exists and we can't overwrite it
if os.path.isdir(self.result):
raise PodError(RES_FOLDER % self.result)
exists = os.path.isfile(self.result)
if not self.overwriteExisting and exists:
raise PodError(RES_EXISTS % self.result)
# Remove the result if it exists
if exists: os.remove(self.result)
# Create a temp folder for storing temporary files
self.tempFolder = f'{self.result}.{time.time()}'
try:
os.mkdir(self.tempFolder)
except OSError as oe:
raise PodError(TEMP_W_KO % (self.result, oe))
# The "unzip" folder is a sub-folder, within self.tempFolder, where
# p_self.template will be unzipped.
self.unzipFolder = os.path.join(self.tempFolder, 'unzip')
os.mkdir(self.unzipFolder)
def patchMetadata(self):
'''Declares, in META-INF/manifest.xml, images or files included via the
"do... from document" statements if any, and patch meta.xml (field
"title").'''
# Patch META-INF/manifest.xml
j = os.path.join
if self.fileNames:
toInsert = ''
for fileName in self.fileNames.keys():
mimeType = mimetypes.guess_type(fileName)[0]
toInsert = f'{toInsert} {bn}'
manifestName = j(self.unzipFolder, 'META-INF', 'manifest.xml')
# Read the content of this file, if not already in
# self.manifestXml.
if not self.manifestXml:
with open(manifestName, **enc) as f:
self.manifestXml = f.read()
hook = ''
content = self.manifestXml.replace(hook, toInsert + hook)
# Write the new manifest content
with open(manifestName, 'w', **enc) as f:
f.write(content)
# Patch meta.xml
metadata = self.metadata
if metadata:
metaName = j(self.unzipFolder, 'meta.xml')
# Read the content of this file, if not already in self.metaXml
if not self.metaXml:
with open(metaName, **enc) as f:
self.metaXml = f.read()
# Remove the existing title, if it exists
content = self.metaRex.sub('', self.metaXml)
# Add a new title, based on the result name
if isinstance(metadata, str):
title = metadata
else:
title = os.path.splitext(os.path.basename(self.result))[0]
hook = self.metaHook
title = f'{title}{hook}'
content = content.replace(hook, title)
with open(metaName, 'w', **enc) as f:
f.write(content)
# Public interface
def run(self):
'''Renders the result'''
try:
# Remember which parser is running
self.currentParser = self.contentParser
# Create the resulting content.xml
self.currentParser.parse(self.contentXml)
self.currentParser = self.stylesParser
# Create the resulting styles.xml
self.currentParser.parse(self.stylesXml)
# Patch metadata
self.patchMetadata()
# Re-zip the result
self.finalize()
finally:
try:
if self.deleteTempFolder:
FolderDeleter.delete(self.tempFolder)
except PermissionError:
# The temp folder could not be deleted
pass
def getStyles(self):
'''Returns a dict of the styles that are defined in the template'''
return self.stylesManager.styles
def setStylesMapping(self, stylesMapping):
'''Establishes a correspondence between, on one hand, CSS styles or
XHTML tags that will be found inside XHTML content given to POD,
and, on the other hand, ODT styles found in the template.'''
try:
manager = self.stylesManager
# Initialise the styles mapping when relevant
ocw = self.optimalColumnWidths
dis = self.distributeColumns
if ocw or dis:
sm.TableProperties.initStylesMapping(stylesMapping, ocw, dis)
manager.stylesMapping = manager.checkStylesMapping(stylesMapping)
except PodError as po:
self.contentParser.env.currentBuffer.content.close()
self.stylesParser.env.currentBuffer.content.close()
if os.path.exists(self.tempFolder):
FolderDeleter.delete(self.tempFolder)
raise po
def getTemplateType(self, template):
'''Identifies the type of this POD p_template (ods or odt). If
p_template is a string, it is a file name: simply get its extension.
Else, it is a binary file in a BytesIO instance: seek the MIME type
from the first bytes.'''
if isinstance(template, str):
r = os.path.splitext(template)[1][1:]
else:
# A BytesIO instance
template.seek(0)
sbytes = template.read(90)
sbytes = sbytes[sbytes.index(b'mimetype') + 8:]
odsMIME = utils.mimeTypes['ods'].encode()
r = 'ods' if sbytes.startswith(odsMIME) else 'odt'
return r
def callLibreOffice(self, resultName, format=None, outputName=None):
'''Call LibreOffice in server mode to convert or update the result'''
if self.loPool is None:
raise PodError(NO_LO_POOL % self.resultType)
return self.loPool(self, resultName, format, outputName)
def finalize(self):
'''Re-zip the result and potentially call LibreOffice if target format
is not among self.templateTypes or if forceOoCall is True.'''
j = os.path.join
# If an action regarding page styles must be performed, get and modify
# them accordingly.
pageStyles = None
mps = self.managePageStyles
if mps is not None:
pageStyles = self.stylesManager.pageStyles.init(mps, self.template)
# Patch styles.xml and content.xml
dynamic = self.stylesManager.dynamicStyles
for name in ('styles', 'content'):
# Copy the [content|styles].xml file from the temp to the zip folder
fn = f'{name}.xml'
shutil.copy(j(self.tempFolder, fn), j(self.unzipFolder, fn))
# Get the file content
fn = os.path.join(self.unzipFolder, fn)
with open(fn, **enc) as f:
content = f.read()
# Inject self.fonts, when present, in styles.xml
isStylesXml = name == 'styles'
if isStylesXml and self.fonts:
content = sm.FontsInjector(self.fonts).injectIn(content)
# Inject dynamic styles
if isStylesXml: content = dynamic.injectIn('styles_base', content)
content = dynamic.injectIn(name, content)
# Rename the page styles
if pageStyles:
content = pageStyles.renameIn(name, content)
# Patch pod graphics, when relevant
if not isStylesXml:
content = Graphic.patch(self, content)
# Write the updated content to the file
with open(fn, 'w', **enc) as f:
f.write(content)
# Call the user-defined "finalize" function(s) when present
if self.finalizeFunction:
try:
for fun in self.finalizeFunction: fun(self.unzipFolder, self)
except Exception as e:
print(WARN_FIN_KO % str(e))
# Re-zip the result, first as an OpenDocument file of the same type as
# the POD template (odt, ods...)
resultName = os.path.join(self.tempFolder,f'result.{self.templateType}')
resultType = self.resultType
zip(resultName, self.unzipFolder, odf=True)
if resultType in self.templateTypes and not self.forceOoCall:
# Simply move the ODT result to the result
os.rename(resultName, self.result)
else:
if resultType not in FILE_TYPES:
raise PodError(R_TYPE_KO % (self.result, FILE_TYPES.keys()))
# Call LibreOffice to perform the conversion or document update
output = self.callLibreOffice(resultName, outputName=self.result)
# I (should) have the result in self.result
if not os.path.exists(self.result):
if resultType in self.templateTypes:
# In this case LO in server mode could not be called (to
# update indexes, sections, etc) but I can still return the
# "raw" pod result that exists in "resultName".
os.rename(resultName, self.result)
else:
raise PodError(CONV_ERR % output)
#- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -