Introduction
I love LaTeX, which I have used for serious documents (which I define as documents with more than ten pages). It is unparalleled in typography, typesetting, and reference and document management. It is almost indispensable for writing mathematical formulas. In addition, it uses plain text files for its documents. The one thing LaTeX is not is WYSIWYG.
On top of that, I rarely need the full power of LaTeX. Usually I only need to convert simple texts into LaTeX without setting up the entire infrastructure for such a document, which involves a lot of boilerplate declarations at the beginning.
For simple texts such as blog posts, Markdown is entirely sufficient. But, of course, the original Markdown is geared towards creating HTML pages. This bias towards HTML becomes obvious as HTML can be added at any point to express additional features that Markdown does not possess.
The simplicity of Markdown, which adds to its attractiveness, also leads to a plethora of versions with additional features and modifications to accommodate items such as tables, references, figure captions and many other things not included in the original specification.
For most documents I want to write, I need only a minimal subset of LaTeX, and the possibilities Markdown offers are more than sufficient. In the end, I have decided to write most of my documents in a variant of Markdown called Pandoc’s Markdown, which is the version of Markdown defined by the document converter Pandoc. It enriches Markdown with features such as tables, captions, references, and mathematical formulas. A significant advantage is that it allows conversion from Markdown to both LaTeX and HTML, and it can be embedded in a richer toolchain to create blogs, articles, books, and EPUBs from one source document. For example, the engine behind this blog uses Pandoc to convert Markdown into blog posts.
Including Graphics
Almost any document, be it a web page, an article, or a paper, will include graphics. Graphics in Markdown are converted to HTML with standard image <img> tags. And of course, there is no need to store these images locally; they can exist anywhere on the internet.
LaTeX is different. It expects that the image file is local and can be accessed via the file system. (It turns out that ConTeXt, which like LaTeX is based on TeX, has a command to include files from a remote URL.) So to include an image or graphic, the \includegraphics command takes a file path and additional parameters to format and scale the picture. A typical use would be
\begin{center}
\includegraphics[width=0.5\textwidth]{1920px-Coccinella_magnifica01.jpg}
\end{center}
to include a centred picture scaled to 50% of the text width of the document.
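As a small illustration of the other formatting parameters, here is a minimal sketch using a few more graphicx keys. It assumes the placeholder picture example-image from the mwe bundle is installed, which is not something the original example relies on.

```latex
\documentclass{article}
\usepackage{graphicx}
\begin{document}
% example-image is a placeholder picture from the mwe bundle (assumed to be installed).
\begin{center}
  % Fit inside a box of 0.8\textwidth by 5cm without distorting the picture.
  \includegraphics[width=0.8\textwidth,height=5cm,keepaspectratio]{example-image}

  % Scale to 25% of the natural size and rotate by 90 degrees.
  \includegraphics[scale=0.25,angle=90]{example-image}
\end{center}
\end{document}
```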
What I wanted to achieve was to specify a URL for an image and have it downloaded and cached, so that the generated LaTeX document compiles to a PDF. Unfortunately, none of the packages seemed to support that natively, so the solution is to download the image via curl or wget. LaTeX allows running shell commands, although this feature is turned off by default and is potentially a security risk. The -shell-escape option has to be passed to the latex command to enable shell commands.
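To check whether shell commands are actually enabled for a given run, something like the following sketch can help. It assumes the document is compiled with pdfLaTeX, since \pdfshellescape is a pdfTeX primitive; other engines expose this differently.

```latex
% Assumes pdfLaTeX: \pdfshellescape is 1 when shell escape is fully enabled,
% 0 when it is disabled and 2 when only restricted commands are allowed.
\documentclass{article}
\begin{document}
\ifnum\pdfshellescape=1
  Shell escape is enabled.
\else
  Shell escape is disabled or restricted.
\fi
\end{document}
```

Compiling this once with and once without -shell-escape should show the two different messages in the PDF.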
Once shell escape is enabled, the command \write18 is available to execute arbitrary shell commands from within the LaTeX file. By default, however, TeX defers a \write until the page containing it is shipped out, so the download would only start after LaTeX has already tried to include the image, which results in an error because the file does not exist yet. Prefixing the \write18 with \immediate runs the command right away, and LaTeX waits for it to finish before it continues typesetting your document, thus allowing LaTeX to find your downloaded image.
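The effect of \immediate can be seen in a minimal sketch like the one below. It assumes a shell that understands echo with output redirection, and the file name shell-test.txt is made up for the illustration.

```latex
% Compile with shell escape enabled, e.g. pdflatex -shell-escape on this file.
\documentclass{article}
\begin{document}
% Runs the shell command right away; TeX waits until it has finished.
\immediate\write18{echo Hello from the shell > shell-test.txt}
% The file exists by now, so it can be read in the same run.
\IfFileExists{shell-test.txt}{\input{shell-test.txt}}{File not found: was shell escape enabled?}
\end{document}
```

Without \immediate, the file would not yet exist on a first run when \IfFileExists looks for it.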
Finally, you probably do not want to download your image every time you are typesetting, but you want to cache the file and use it as usual if available.
Putting all this together allows redefining \includegraphics to accept a URL: it checks whether the file already exists, downloads it if not, and includes the downloaded image in the typeset PDF.
The resulting macro is a drop-in replacement for \includegraphics. It assumes that the last part of the URL of the image will be the filename stored on the file system. So http://example.com/someimage.jpg will work, whereas http://example.com/someimage/ will probably not.
% Requires \makeatletter and \let\latexincludegraphics\includegraphics beforehand.
\renewcommand{\includegraphics}[2][]{%
  % Split the URL into \filename@area, \filename@base and \filename@ext.
  \filename@parse{#2}%
  \edef\filename@base{\detokenize\expandafter{\filename@base}}%
  % Log the local filename.
  \typeout{\filename@base.\filename@ext}%
  % Download the image with wget unless a local copy already exists.
  \IfFileExists{\filename@base.\filename@ext}{}{\immediate\write18{wget #2}}%
  % Hand over to the original \includegraphics.
  \latexincludegraphics[#1]{\filename@base.\filename@ext}%
}
\documentclass{article}
\usepackage{graphicx}
\let\latexincludegraphics\includegraphics
\makeatletter
\renewcommand{\includegraphics}[2][]{%
  % Split the URL into \filename@area, \filename@base and \filename@ext.
  \filename@parse{#2}%
  \edef\filename@base{\detokenize\expandafter{\filename@base}}%
  % Log the local filename.
  \typeout{\filename@base.\filename@ext}%
  % Download the image with wget unless a local copy already exists.
  \IfFileExists{\filename@base.\filename@ext}{}{\immediate\write18{wget #2}}%
  % Hand over to the original \includegraphics.
  \latexincludegraphics[#1]{\filename@base.\filename@ext}%
}
\makeatother
\begin{document}
\section{Ladybirds}
Ladybirds are a widespread family of small beetles ranging in size from 0.8 to 18 mm
(0.03 to 0.71 in). The family is commonly known as ladybugs in North America and
ladybirds in Great Britain and other parts of the English-speaking world.
\begin{center}
\includegraphics[width=0.5\textwidth]{%
https://upload.wikimedia.org/wikipedia/commons/thumb/f/f2/Coccinella_magnifica01.jpg/1920px-Coccinella_magnifica01.jpg}
\end{center}
\end{document}
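For URLs whose last part is not a usable filename, one possible variation is to pass the local filename explicitly. The following is only a sketch: \remotegraphics and the name ladybird.jpg are made up for this illustration, and it relies on wget's -O option to choose the output file.

```latex
% Compile with shell escape enabled; requires wget on the system.
\documentclass{article}
\usepackage{graphicx}
% \remotegraphics[<options>]{<url>}{<local file>} is a hypothetical helper, not part of graphicx.
\newcommand{\remotegraphics}[3][]{%
  % Download only if the local copy is missing; -O sets wget's output filename.
  \IfFileExists{#3}{}{\immediate\write18{wget -O "#3" "#2"}}%
  \includegraphics[#1]{#3}%
}
\begin{document}
\begin{center}
\remotegraphics[width=0.5\textwidth]{%
https://upload.wikimedia.org/wikipedia/commons/thumb/f/f2/Coccinella_magnifica01.jpg/1920px-Coccinella_magnifica01.jpg}{ladybird.jpg}
\end{center}
\end{document}
```

This leaves \includegraphics itself untouched, at the cost of no longer being a drop-in replacement. URLs containing characters that are special to TeX, such as % or #, would still need extra care, and a failed download may leave an empty file behind that has to be deleted by hand.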
The source can be found on GitHub in a gist.
References
Coming up with this macro would not have been possible without a number of questions asked on TeX Stack Exchange, in particular: