Quick Start

The examples below illustrate some basic TexSoup functions.

How to Use

Here is a \(\LaTeX\) document:

>>> tex_doc = r"""
... \begin{document}
... \section{Hello \textit{world}.}
... \subsection{Watermelon}
... (n.) A sacred fruit. Also known as:
... \begin{itemize}
...   \item red lemon
...   \item life
... \end{itemize}
... Here is the prevalence of each synonym, in Table \ref{table:synonyms}.
... \begin{tabular}{c c}\label{table:synonyms}
...   red lemon & uncommon \\ \n
...   life & common
... \end{tabular}
... \end{document}
... """

Call TexSoup on this string to parse the document into a nested data structure:

>>> from TexSoup import TexSoup
>>> soup = TexSoup(tex_doc)
>>> soup
\begin{document}
\section{Hello \textit{world}.}
\subsection{Watermelon}
(n.) A sacred fruit. Also known as:
\begin{itemize}
\item red lemon
\item life
\end{itemize}
Here is the prevalence of each synonym, in Table \ref{table:synonyms}.
\begin{tabular}{c c}\label{table:synonyms}
red lemon & uncommon \\ \n
life & common
\end{tabular}
\end{document}

Here are a few ways to navigate the TexSoup data structure:

>>> soup.section
\section{Hello \textit{world}.}
>>> soup.section.name
'section'
>>> soup.section.string
'Hello \\textit{world}.'
>>> soup.section.parent.name
'document'
>>> soup.tabular
\begin{tabular}{c c}\label{table:synonyms}
red lemon & uncommon \\ \n
life & common
\end{tabular}
>>> soup.tabular.args[0]
'c c'
>>> soup.item
\item red lemon

>>> list(soup.find_all('item'))
[\item red lemon
  , \item life
]

A common task is finding all references. To do so, search for \ref{<label>}. You can also report where each reference occurs, as a pair of line number and character offset within that line:

>>> soup.count(r'\ref{table:synonyms}')
1
>>> for cmd in soup.find_all(r'\ref{table:synonyms}'):
...   soup.char_pos_to_line(cmd.position)
(8, 49)

Another task may be to extract all text from the document:

>>> list(soup.text)
['Hello ', 'world', '.', 'Watermelon', '\n\n(n.) A sacred fruit. Also known as:\n\n', 'red lemon\n', 'life\n', '\n\nHere is the prevalence of each synonym.\n\n', '\nred lemon & uncommon \\\\ ', '\nlife & common\n']
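
The extracted text arrives as a list of fragments with the source's line breaks intact. If a single plain string is more convenient, one possible approach is to join the fragments and collapse whitespace. The sketch below uses only the standard library; `to_plain_text` is a hypothetical helper, not a TexSoup function, and the fragments are the first few items copied from the output above.

```python
import re

# First few fragments copied from the output of list(soup.text) above.
fragments = [
    'Hello ', 'world', '.', 'Watermelon',
    '\n\n(n.) A sacred fruit. Also known as:\n\n',
    'red lemon\n', 'life\n',
]

def to_plain_text(pieces):
    """Join text fragments and collapse each run of whitespace to one space."""
    return re.sub(r"\s+", " ", "".join(pieces)).strip()

print(to_plain_text(fragments))
# Hello world.Watermelon (n.) A sacred fruit. Also known as: red lemon life
```

The fragments are joined without a separator, so adjacent headings run together ("world.Watermelon"). Whether to insert a separator between fragments depends on what the text is used for.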

Does this look promising? If so, try TexSoup online or read on to install.

How to Install

TexSoup is published on PyPI, so you can install it with pip. The package name is TexSoup:

pip install TexSoup

Alternatively, you can install the package from source:

git clone https://github.com/alvinwan/TexSoup.git
cd TexSoup
python setup.py install
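
After either installation route, a quick way to confirm that the package is visible to your interpreter is to query its installed version. This is a minimal sketch that assumes Python 3.8 or later (for `importlib.metadata`); `installed_version` is a hypothetical helper written for this example.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

# Prints a version string if TexSoup is installed, otherwise None.
print(installed_version("TexSoup"))
```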