David Bradley ISSUE #40
September 2004

That INChI Feeling

Steve Heller  
Over the years, IUPAC names have become increasingly cumbersome. In response, Dmitrii Tchekhovskoi, Steve Stein, and Steve Heller of the National Institute of Standards and Technology (NIST) have worked to develop a means of identifying chemical structures on a computer without having to work out a complex, standard name for each one. The RSC's Alan McNaught heads the IUPAC Task Group for the project, while Reactive Reports publisher ACD/Labs has incorporated the concept into its ACD/ChemSketch structure drawing program. The resulting IUPAC-NIST Chemical Identifier (INChI) could revolutionize chemical information retrieval, cheminformatics, and data mining.

 Screengrab of INChI in action on ChemSketch (Images by David Bradley)

The chemical structure of a compound is its only true identifier, according to the developers of INChI; however, a drawn structure is neither unique nor convenient for computers to handle. The INChI team, building on Heller's original idea, has devised an algorithm that takes a structure's atom connection table and converts it into a unique string of characters.
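To illustrate the general idea of turning a connection table into an identifier that does not depend on how the atoms happen to be numbered, here is a minimal Python sketch. It is not the INChI algorithm, which this article does not describe: the function name `canonical_string`, the output format, and the brute-force search over every atom ordering are assumptions made purely for illustration, and the approach is only practical for very small structures.

```python
from itertools import permutations


def canonical_string(elements, bonds):
    """Toy canonical identifier for a small connection table.

    NOT the INChI algorithm: a hypothetical illustration only.
    elements: list of element symbols, indexed by atom number
    bonds:    iterable of (i, j) pairs of bonded atom numbers
    """
    n = len(elements)
    best = None
    # Try every possible renumbering of the atoms.
    for perm in permutations(range(n)):
        # perm[old] is the new number assigned to atom `old`.
        atoms = [None] * n
        for old, new in enumerate(perm):
            atoms[new] = elements[old]
        # Rewrite each bond with the new numbers, smaller number first.
        new_bonds = sorted(tuple(sorted((perm[i], perm[j]))) for i, j in bonds)
        candidate = ".".join(atoms) + "/" + ",".join(
            f"{a}-{b}" for a, b in new_bonds
        )
        # Keep the lexicographically smallest string over all renumberings.
        if best is None or candidate < best:
            best = candidate
    return best


# Ethanol heavy atoms, numbered two different ways.
ethanol_a = canonical_string(["C", "C", "O"], [(0, 1), (1, 2)])
ethanol_b = canonical_string(["O", "C", "C"], [(1, 0), (2, 1)])
# Dimethyl ether: same atoms, different connectivity.
ether = canonical_string(["C", "O", "C"], [(0, 1), (1, 2)])

print(ethanol_a == ethanol_b)  # same compound, same string
print(ethanol_a == ether)      # different compound, different string
```

Because every renumbering of the same structure produces the same set of candidate strings, picking the smallest one gives an identifier that is independent of how the structure was drawn. A production identifier would need a far more efficient canonical numbering and would have to account for details this sketch ignores, such as hydrogens, bond orders, charge, and stereochemistry.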

The latest version of INChI, which is now available, handles not only organic, covalent structures but also inorganic and organometallic compounds.

In practice, a user will simply draw a compound's structure in a package, such as ACD/ChemSketch, and the built-in INChI component will convert it to a unique identifier. The string could, in theory, be converted back to a structure. "We will all reap the benefits of a generally accepted computer convention for uniquely representing and communicating the identity of any chemical substance," McNaught told Reactive Reports.



Test programs (for Microsoft Windows), documentation, and sample structure files are available for download at: http://chemdata.nist.gov/IChI/INChIv11b.zip