GiPSy Scanner

GiPSy is an extensible, general-purpose text scanner and parser. The base scanner is intended to be subclassed and extended to provide the desired functionality.

For an example of a basic Python source code scanner built this way, refer to the Python source code parser (also linked below).

The steps to extend are:

And you're done! Create an instance of your new class, feed it the text to be parsed through the tokenize() method, and obtain the output from the read() method. The read() method takes three optional arguments:

By default, the read() method outputs the original, unmodified text.

View the py2html application for an example of using the Py2HTMLParser class in a real-life setting.

In your subclass, you can also access the self._tlist attribute directly to work with the actual list of tokens, which lets you provide additional functionality on top of decorated printing.
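
The subclass, tokenize(), read() workflow described above can be sketched roughly as follows. This is not GiPSy's actual code: BaseScanner is a hypothetical stand-in written only for this illustration, and the classify() and decorate() hooks, the (kind, value) token layout, and the KeywordScanner class are all assumptions. Only the names tokenize(), read(), and _tlist come from the text above. The stand-in read() takes no arguments, since the sketch does not attempt to model the real method's optional arguments.

```python
import re


class BaseScanner:
    """Hypothetical stand-in for a GiPSy-style base scanner (not the real class)."""

    # Assumed token pattern: a word, a whitespace run, or any single other character.
    _pattern = re.compile(r"\w+|\s+|.", re.DOTALL)

    def __init__(self):
        # Token list; the attribute name follows the text above,
        # but the (kind, value) layout is an assumption.
        self._tlist = []

    def tokenize(self, text):
        # Split the text into tokens and store them with their kind.
        self._tlist = [(self.classify(tok), tok)
                       for tok in self._pattern.findall(text)]

    def classify(self, token):
        # Hypothetical hook: subclasses override this to label tokens.
        return "text"

    def decorate(self, kind, value):
        # Hypothetical hook: by default a token is emitted unchanged,
        # so read() reproduces the original text.
        return value

    def read(self):
        # Join the (possibly decorated) tokens back into one string.
        return "".join(self.decorate(kind, value) for kind, value in self._tlist)


class KeywordScanner(BaseScanner):
    """Example subclass: wraps a few keywords in <b> tags."""

    KEYWORDS = {"def", "return", "if", "else"}

    def classify(self, token):
        return "keyword" if token in self.KEYWORDS else "text"

    def decorate(self, kind, value):
        return f"<b>{value}</b>" if kind == "keyword" else value

    def keyword_count(self):
        # Works on the token list directly, beyond decorated printing.
        return sum(1 for kind, _ in self._tlist if kind == "keyword")


scanner = KeywordScanner()
scanner.tokenize("def f(x): return x")
print(scanner.read())           # <b>def</b> f(x): <b>return</b> x
print(scanner.keyword_count())  # 2
```

In this sketch the undecorated BaseScanner returns its input unchanged from read(), mirroring the default behaviour described above, while keyword_count() shows the kind of extra functionality that direct access to the token list allows.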

Source Code and Downloads

View the GiPSy scanner source code:

Python distutils distributions available from the Python Package Index:

Regular compressed archives available locally: