Re: how-to query an xml repository efficiently

2009-09-08 Thread Jeroen Reijn
Hi Robby, do you perhaps have any more specs on what kind of XML database it is? At our company we have experience with an Apache Slide backed database, which we used for storing XML files, letting Slide index them with Lucene. Then, based on DASL queries, we could search the repository really

Re: how-to query an xml repository efficiently

2009-09-08 Thread Mark Diggory
eXist has been (IMO) a more productive project that already utilizes Cocoon for presentation. Its origins were in the same db:xml codebase that Xindice was based on, but it has (again IMO) a richer, more active developer community around it. I've contributed to it and used it in the past, the developers

Re: how-to query an xml repository efficiently

2009-09-08 Thread DAVIGNON Andre - CETE NP/DIODé/PANDOC
Hi, eXist is, as you say, a very productive project with an active community. That's why we like it (and use it)! I proposed Solr because it seems that the point is just to query the XML, not to store it. If the point is to query XML data with XQuery and store it, eXist is _the_ solution, as far as

RE: how-to query an xml repository efficiently

2009-09-08 Thread Robby Pelssers
Hi Jeroen and others who replied to my mail... Let me further explain my use case and existing infrastructure. My customer stores their product data in XML files on the file system, e.g. ${repofolder}/ products/ product-1/ product-1.xml

Re: how-to query an xml repository efficiently

2009-09-08 Thread DAVIGNON Andre - CETE NP/DIODé/PANDOC
> So for certain "Filter"-criteria I'll have to get all possible values so they can pick one and for others I don't need to know anything about the actual data. You can choose the data (XML properties) you want to index and the ones you don't want to. > The actual product xml-files are +- 50

Re: how-to query an xml repository efficiently

2009-09-08 Thread Jeroen Reijn
Hi Robby, I think Solr would be a great match for this use case. You can push XML to Solr with an HTTP client and let Solr index the information. See the post.jar that comes with the Solr example. It pushes XML to the Solr app and indexes it based on your configuration. The

RE: how-to query an xml repository efficiently

2009-09-08 Thread Robby Pelssers
You all convinced me to investigate the SOLR path further ;-) I already installed SOLR yesterday but I did not spend enough time playing with it. That's why I ask the experts on this mailing list ;-) David's answer "The facet research functionality in Solr can giv

Re: Send a pdf file to a printer

2009-09-08 Thread Jan Grathwohl
What do you mean by "local printer"? Should the server print the document to a printer in the local office network where the application is running, or should the computer running the client (browser) print the file to its own printer, only without displaying it first? If it's the first,

Re: how-to query an xml repository efficiently

2009-09-08 Thread David Legg
Hi Robby, It sounds to me from your description that what you need is a common or garden CMS (Content Management System) based on a JCR (Java Content Repository) like Apache Jackrabbit [1]. There are a number of CMS projects built on top of this platform (Hippo CMS7 [2] for example). Don'

RE: how-to query an xml repository efficiently

2009-09-08 Thread Robby Pelssers
By the way... one more question. A real life snippet of such an xml file looks like this: 1423004 mounting method type Non-quantitative Property S surface mount The user wants to search on characteristic "mounting meth

Re: how-to query an xml repository efficiently

2009-09-08 Thread Jeroen Reijn
Hi Robby, yes you can with the forrest components [1] Example: value="{properties:solr.update.url}"/> Regards, Jeroen Robby Pelssers wrote: By the way... one more question. A real life snippet of such an xml file looks like this: 1423004 mounting method type

Re: how-to query an xml repository efficiently

2009-09-08 Thread DAVIGNON Andre - CETE NP/DIODé/PANDOC
Robby, One more thing about this subject. You can do all that stuff directly with Cocoon / Lucene with Java code only, but Solr offers rich possibilities for index configuration through schema.xml, and the index can be handled with an HTTP client inside Cocoon through the Solr XML / HTTP API. Or in Java

Re: how-to query an xml repository efficiently

2009-09-08 Thread Mark Diggory
I utilize Solr as well. I would exemplify the differences between using Solr+Cocoon and eXist+Cocoon as follows. Solr: good for term indexing large amounts of content while not retaining the structural nature of that XML content or necessarily having to store it. eXist: good for storing lar

Re: how-to query an xml repository efficiently

2009-09-08 Thread andre . davignon
We do it the same way here. Thanks for this excellent post and all these explanations and use cases. André -Original Message- From: Mark Diggory Date: Tue, 8 Sep 2009 09:03:04 To: Subject: Re: how-to query an xml repository efficiently I utilize Solr as well. I would exemplify the d

Re: using Cocoon for on-the-fly minification?

2009-09-08 Thread Kamal Bhatt
Hi. What you are talking about is a reader. I created a Cocoon 2.2 reader which minified JS scripts (using jsmin), but the dev mailing list debated whether there was a real need for this, so I abandoned the project [1]. I agreed with Reinhard in the end. The basic argument agai

RE: how-to query an xml repository efficiently

2009-09-08 Thread Klein Ikkink, Hubert
Hi Robby, > The actual product xml-files are +- 500kb on average and I'm talking > about LOTS of products so I have to consider performance upfront. Just create the XML documents with all information you want indexed and post them to the Solr instance. If you keep a reference to the original file

RE: how-to query an xml repository efficiently

2009-09-08 Thread Klein Ikkink, Hubert
Hi Robby, > So in the end I do NOT want to index the original product-xml but an xml > file generated by cocoon which only extracts the searchable data and > transforms it into an easy searchable format. > Can I post files to SOLR based on a URL (cocoon pipeline)?? For a client we created a C
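
Several replies above suggest extracting only the searchable data and posting it to Solr as XML over HTTP. A minimal sketch of that idea in Python follows. The `<add><doc><field name="...">` envelope is Solr's XML update message format; everything else is an assumption made for illustration and does not come from the thread: the field name `mounting_method`, the helper names, and the update URL `http://localhost:8983/solr/update`. Real field names must match whatever is declared in the Solr schema.xml.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_add_message(doc_id, fields):
    """Build a Solr XML <add> message for one document.

    doc_id and the keys of `fields` are hypothetical; they must
    correspond to fields declared in the Solr schema.xml.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    id_field = ET.SubElement(doc, "field", name="id")
    id_field.text = doc_id
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    # Returns UTF-8 encoded bytes, ready to be used as an HTTP body.
    return ET.tostring(add, encoding="utf-8")


def build_update_request(update_url, payload):
    """Prepare (but do not send) an HTTP POST carrying the XML payload."""
    return urllib.request.Request(
        update_url,
        data=payload,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


if __name__ == "__main__":
    # Only the searchable data is sent, not the full product file.
    payload = build_add_message(
        "product-1", {"mounting_method": "surface mount"}
    )
    # Assumed URL; adjust to the actual Solr installation.
    request = build_update_request(
        "http://localhost:8983/solr/update", payload
    )
    print(payload.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(request)
```

The sketch only builds the request, so it can be tried without a running Solr instance. After posting documents, a separate `<commit/>` message sent to the same update URL is needed before they become searchable.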