Wednesday, September 06, 2006

Google Helps HP With OCR, But Why?

This might shed some light on why I think Google will dominate physical world connection and the mobile marketing space.

From ZDNet: Google helps HP with OCR engine

Google engineers have apparently been at work reviving an old indexing engine developed and left to rust by Hewlett-Packard.

The search giant announced that it's helped fix software bugs in the 2-decades-old Tesseract, an optical character recognition (OCR) engine originally built by HP Labs and retired in 1995 before the company released the code to the open-source community in recent months.

Why is Google interested in OCR? According to the company, which posted the news Thursday on its code page: "In a nutshell, we are all about making information available to users (I would add: on all mediums), and when this information is in a paper document, OCR is the process by which we can convert the pages of this document into text that can then be used for indexing."

Google claims that Tesseract OCR is "far more accurate than any other Open Source OCR package out there."

Maybe this is making more sense now.

See more of Google and OCR

1 comment:

Anonymous said...

My guess is that Google's interest in OCR mainly stems from its involvement in its book scanning project.