On 10/01/12 09:44, Ankit Verma wrote:
> I want to create a simple poc : like a user simply uploads a document
> in any format and it should be persisted and now if any other user
> searches for the uploaded document then searching should be done and
> user gets the exact match.

So it sounds to me that you have two primary requirements:

* a document search and retrieval system with upload

* the ability to augment normal search over the documents with a structured search across the embedded data or metadata that *some* of the documents will have.

If that's a fair summary, then I think you're going about it in the wrong order by starting with Jena and then building in document handling, especially if you're just trying to create a PoC. If it were me, I'd look at the many open source CMS solutions to provide the core of your document handling requirements, then look to how you could augment the workflow to extract the embedded data / metadata for some documents, and augment the search to use that data where it's available.
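
To make that layering concrete, here is a rough sketch in Python. It is only an illustration under assumptions of mine: DocumentStore, upload and search are hypothetical names, not the API of any real CMS, Jena or Stanbol, and an in-memory dict stands in for real persistence. The point is that every document gets plain keyword search, and structured metadata narrows the results only when the caller asks for it.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    text: str
    # Structured data extracted at upload time; empty when none was found.
    metadata: dict = field(default_factory=dict)


class DocumentStore:
    """Toy in-memory stand-in for the CMS layer (hypothetical, not a real API)."""

    def __init__(self):
        self._docs = {}

    def upload(self, doc_id, text, metadata=None):
        # A real system would persist the file and run an extraction step here.
        self._docs[doc_id] = Document(doc_id, text, dict(metadata or {}))

    def search(self, keyword, **metadata_filters):
        """Plain keyword search, narrowed by metadata when filters are given."""
        keyword = keyword.lower()
        hits = []
        for doc in self._docs.values():
            if keyword not in doc.text.lower():
                continue
            # With no filters this is vacuously true, so every text match is kept.
            if all(doc.metadata.get(k) == v for k, v in metadata_filters.items()):
                hits.append(doc.doc_id)
        return sorted(hits)


store = DocumentStore()
store.upload("a.txt", "Quarterly report on sales", {"author": "Ankit"})
store.upload("b.txt", "Sales forecast, no embedded metadata")

print(store.search("sales"))                  # ['a.txt', 'b.txt']
print(store.search("sales", author="Ankit"))  # ['a.txt']
```

Note that in this sketch a document without metadata drops out as soon as a filter is applied. Whether that is the right behaviour for your users is a design decision you would need to make.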

A good place to start might be Apache Stanbol:

http://incubator.apache.org/stanbol/index.html

Ian
