Re: Indexing gets significantly slower after every batch commit

2015-05-22 Thread Siegfried Goeschl
Hi Angel, a while ago I had issues with a VMware VM - somehow snapshots were created regularly which dragged down the machine. So I think it is a good idea to baseline the performance on a physical box before moving to VMs, production boxes or whatever is thrown at you. Cheers, Siegfried Goeschl

Re: New article on ZK Poison Packet

2015-05-10 Thread Siegfried Goeschl
Cool stuff - thanks for sharing Siegfried Goeschl On 09 May 2015, at 08:43, steve sc_shep...@hotmail.com wrote: While very technical and unusual, a very interesting view of the world of Linux and ZooKeeper Clusters... http://www.pagerduty.com/blog/the-discovery-of-apache-zookeepers-poison

Re: Indexing PDF and MS Office files

2015-04-16 Thread Siegfried Goeschl
at commons-exec :-) Cheers, Siegfried Goeschl PS: one more thing - please, tell your management that you will never ever successfully index all real-world PDFs and cater for that fact in your requirements :-) On 16.04.15 13:10, Vijaya Narayana Reddy Bhoomi Reddy wrote: Erick, I tried indexing

Re: Measuring QPS

2015-04-06 Thread Siegfried Goeschl
this might be a better solution in the long run. But this requires to have a separate SOLR core ingest plus GUI (check out SILK or ELK) - in other words more moving parts in production :-) * If there is sufficient interest I can make a code drop on GitHub Cheers, Siegfried Goeschl On 06

Re: Measuring QPS

2015-04-06 Thread Siegfried Goeschl
, Siegfried Goeschl On 06 Apr 2015, at 20:04, Davis, Daniel (NIH/NLM) [C] daniel.da...@nih.gov wrote: Siegfried, It is early days as yet. I don't think we need a code drop. AFAIK, none of our current Solr applications autocomplete the search box based on popular query/title keywords

Re: Measuring QPS

2015-04-06 Thread Siegfried Goeschl
Appreciated :-) Siegfried Goeschl On 06 Apr 2015, at 20:31, Davis, Daniel (NIH/NLM) [C] daniel.da...@nih.gov wrote: OK, I have a lot of chutzpah posting that here ;) The other guys answering the questions can probably explain it better. I love showing off, however, so please

Re: Measuring QPS

2015-04-06 Thread Siegfried Goeschl
The good-sounding thing - you can do that easily with JMeter running the GUI or the command-line Cheers, Siegfried Goeschl On 06 Apr 2015, at 21:35, Davis, Daniel (NIH/NLM) [C] daniel.da...@nih.gov wrote: This sounds really good: For load testing, we replay production logs to test

Re: Measuring QPS

2015-04-06 Thread Siegfried Goeschl
* the XML processor uses a Stax parser to handle huge JTL files (exceeding 1 GB) * it also caters for merging JTL files when running multiple JMeter instances Cheers, Siegfried Goeschl On 06 Apr 2015, at 22:57, Walter Underwood wun...@wunderwood.org wrote: The load testing is the easiest
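The reply above mentions a StAX parser for JTL files exceeding 1 GB. As a rough illustration of the same streaming idea (not the author's Java code), the Python sketch below swaps in `xml.etree.ElementTree.iterparse` for StAX and releases each element once it has been read. The element names `httpSample` and `sample` and the attributes `lb` (label) and `t` (elapsed milliseconds) are assumed from JMeter's XML result format; the function name is hypothetical.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict


def summarize_jtl(source):
    """Stream a JTL XML file and return {label: (count, mean_elapsed_ms)}."""
    totals = defaultdict(lambda: [0, 0])  # label -> [count, sum of elapsed ms]
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag in ("httpSample", "sample"):
            entry = totals[elem.get("lb", "unknown")]
            entry[0] += 1
            entry[1] += int(elem.get("t", "0"))
        elem.clear()  # release attribute and child data once processed
    return {label: (n, total / n) for label, (n, total) in totals.items()}


if __name__ == "__main__":
    # Tiny in-memory stand-in for a real JTL file (assumed layout).
    demo = io.BytesIO(
        b'<testResults version="1.2">'
        b'<httpSample t="100" lb="search" s="true"/>'
        b'<httpSample t="300" lb="search" s="true"/>'
        b'</testResults>'
    )
    print(summarize_jtl(demo))
```

Because `source` can be a file name or any binary file object, merging results from several JMeter instances could be approximated by summing the per-file totals; that part is left out here.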

Re: Measuring QPS

2015-04-04 Thread Siegfried Goeschl
-to-production-20121210.pdf http://people.apache.org/~sgoeschl/presentations/jsug-2015/jee-performance-monitoring.pdf Cheers, Siegfried Goeschl On 03 Apr 2015, at 17:53, Shawn Heisey apa...@elyograg.org wrote

Re: Trending functionality in Solr

2015-02-09 Thread Siegfried Goeschl
If you are interested we could team up and make a proper SOLR contribution :-) Cheers, Siegfried Goeschl On 08.02.15 05:26, S.L wrote: Folks, Is there a way to implement the trending functionality using Solr , to give the results using a query for say the most searched terms in the past

Re: OutOfMemoryError for PDF document upload into Solr

2015-01-16 Thread Siegfried Goeschl
Hi Dan, neat idea - made a mental note :-) That brings us back to the point that in complex setups you should not do the document pre-processing directly in SOLR but have an import process which can safely crash when processing a 4GB PDF file Cheers, Siegfried Goeschl On 16.01.15 05:02

Re: OutOfMemoryError for PDF document upload into Solr

2015-01-15 Thread Siegfried Goeschl
Hi Ganesh, you can increase the heap size but parsing a 4 GB PDF document will very likely consume A LOT OF memory - I think you need to check if that large PDF can be parsed at all :-) Cheers, Siegfried Goeschl On 14.01.15 18:04, Michael Della Bitta wrote: Yep, you'll have to increase

Re: Slow queries

2014-12-08 Thread Siegfried Goeschl
expensive SOLR queries, what your server code is doing - many questions and even more answers to that - in other words nobody can help you when the basic work is not done. And when you know your application performance-wise you probably also know the solution :-) Cheers, Siegfried Goeschl

Re: Slow queries

2014-12-02 Thread Siegfried Goeschl
If your performance was fine but degraded over time it might be easier to check / increase the memory to have better disk caching. Cheers, Siegfried Goeschl On 02.12.14 09:27, melb wrote: Hi, I have a solr collection with 16 millions documents and growing daily with 1 documents

Re: Slow queries

2014-12-02 Thread Siegfried Goeschl
for the next three years :-) Cheers, Siegfried Goeschl On 02 Dec 2014, at 10:02, melb melaggo...@gmail.com wrote: Yes performance degraded over the time, I can raise the memory but I can't do it every time and the volume will keep growing Is it better to put the solr on dedicated machine

Re: AW: AW: slorj - httpclient 4, but we already have httpclient 3 in use

2014-09-19 Thread Siegfried Goeschl
Lucky you :-) Siegfried Goeschl On 19.09.14 07:31, Clemens Wyss DEV wrote: I'd like to mention, that substituting the httpcore.jar with the latest (4.3) sufficed... -Original Message- From: Guido Medina [mailto:guido.med...@temetra.com] Sent: Thursday, 18 September 2014

Re: slorj - httpclient 4, but we already have httpclient 3 in use

2014-09-18 Thread Siegfried Goeschl
but not on the production server or might change to a change in the project Cheers, Siegfried Goeschl On 18.09.14 15:08, Clemens Wyss DEV wrote: I am doing initial steps with solrj which is based on httpclient 4. Unfortunately parts of our framework are based on httpclient 3. So when I instantiate

Re: AW: slorj - httpclient 4, but we already have httpclient 3 in use

2014-09-18 Thread Siegfried Goeschl
AFAIK even the different minor versions are source/binary compatible so you might need to tinker with the right version to get your server running Cheers, Siegfried Goeschl On 18.09.14 17:45, Guido Medina wrote: Hi Clemens, If you are going thru the effort of migrating from SolrJ 3 to 4

Re: Mongo DB Users

2014-09-16 Thread Siegfried Goeschl
remove please On 16.09.14 15:42, Karolina Dobromiła Jeleń wrote: remove please On Tue, Sep 16, 2014 at 9:35 AM, Amey Patil amey.pa...@germin8.com wrote: Remove. On Tue, Sep 16, 2014 at 12:58 PM, Joan joan.monp...@gmail.com wrote: Remove please 2014-09-16 6:59 GMT+02:00 Patti Kelroe-Cooke

Re: external indexer for Solr Cloud

2014-09-01 Thread Siegfried Goeschl
the Camel Solr Integration http://comments.gmane.org/gmane.comp.jakarta.lucene.solr.user/99739 Cheers, Siegfried Goeschl On 01.09.14 18:05, Jack Krupansky wrote: Packaging SolrCell in the same manner, with parallel threads and able to talk to multiple SolrCloud servers in parallel would have

Re: SOLR Performance benchmarking

2014-07-13 Thread Siegfried Goeschl
, optimise them, re-run your tests ** check your cache warming and how fast you start your load injector threads Cheers, Siegfried Goeschl On 13 Jul 2014, at 09:53, rashi gandhi gandhirash...@gmail.com wrote: Hi, I am using SolrMeter for load/stress testing solr performance. Tomcat

Re: SOLR: getting documents in the given order

2014-06-03 Thread Siegfried Goeschl
Assuming that you just want to sort - have you tried using sort=id desc Cheers, Siegfried Goeschl On 04 Jun 2014, at 06:19, sachinpkale sachinpk...@gmail.com wrote: I have a following field in SOLR schema. fieldType name=integer class=solr.IntField omitNorms=true/ field name=id type
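The `sort=id desc` suggestion above is an ordinary Solr request parameter. A minimal sketch of how such a request URL might be assembled follows; the base URL, the helper name, and the default values are placeholders chosen for illustration, not something taken from the thread.

```python
from urllib.parse import urlencode


def build_sorted_query(base_url, query="*:*", sort="id desc", rows=10):
    """Return a Solr select URL that orders results by the given sort clause."""
    params = {"q": query, "sort": sort, "rows": rows}
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host and core name.
    print(build_sorted_query("http://localhost:8983/solr/collection1"))
```

Sorting only orders by field value; it does not return documents in an arbitrary caller-supplied order, which is why the reply is phrased as an assumption.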

iText hitting infinite loop - Was Re: pdfs

2014-06-02 Thread Siegfried Goeschl
by * Apache PDFBox 1.8.4 onwards * Apache Tika 1.5 * Apache SOLR 4.8 Cheers, Siegfried Goeschl On 26.05.14 18:20, Erick Erickson wrote: Brian: Yeah, if you can share the PDF that would be great. Parsing via Tika should not bring down Solr, although I suppose there could be something in Tika

Re: ExtractingRequestHandler indexing zip files

2014-05-27 Thread Siegfried Goeschl
Hi Sergio, you either do the stuff on the caller side (which is probably a good idea since you are off-loading the SOLR server) or extend the ExtractingRequestHandler Cheers, Siegfried Goeschl On 27 May 2014, at 10:37, marotosg marot...@gmail.com wrote: Hi, Thanks for your answer Alexandre

Re: pdfs

2014-05-25 Thread Siegfried Goeschl
Hi Brian, can you send me the email? I would like to play around :-) Have you opened a JIRA for PdfBox? If not I will open one if I can reproduce the issue … Thanks in advance Siegfried Goeschl On 25 May 2014, at 04:18, Brian McDowell brianmc...@gmail.com wrote: Our feeding (indexing

Re: pdfs

2014-05-25 Thread Siegfried Goeschl
Sorry typo :- can you send me the PDF by email directly :-) Siegfried Goeschl On 25 May 2014, at 10:06, Siegfried Goeschl sgoes...@gmx.at wrote: Hi Brian, can you send me the email? I would like to play around :-) Have you opened a JIRA for PdfBox? If not I will open one if I can

Re: SolrCloud Nodes autoSoftCommit and (temporary) missing documents

2014-05-25 Thread Siegfried Goeschl
Hi folks, I think that the timestamp should be rounded down to a minute (or whatever) to avoid trashing the filter query cache Cheers, Siegfried Goeschl On 25 May 2014, at 18:19, Steve McKay st...@b.abbies.us wrote: Solr can add the filter for you: requestHandler ... lst name=appends
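The rounding advice above can be sketched on the client side as follows. This is only an illustration: the field name `timestamp` and the direction of the range are assumptions, since the original filter is not shown in the snippet. When the filter is built inside Solr itself, the date math expression `NOW/MINUTE` rounds down in the same way.

```python
from datetime import datetime, timezone


def floor_to_minute(moment):
    """Drop seconds and microseconds so all calls within one minute agree."""
    return moment.replace(second=0, microsecond=0)


def build_time_filter(field, moment):
    """Build an fq clause such as 'timestamp:[* TO 2014-05-25T18:19:00Z]'."""
    rounded = floor_to_minute(moment.astimezone(timezone.utc))
    return f"{field}:[* TO {rounded.strftime('%Y-%m-%dT%H:%M:%SZ')}]"


if __name__ == "__main__":
    print(build_time_filter("timestamp", datetime.now(timezone.utc)))
```

Every request issued within the same minute then produces an identical filter string, so the cached filter entry can be reused instead of a new one being created per request.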

Re: pdfs

2014-05-22 Thread Siegfried Goeschl
the document extraction stuff out of SOLR * provide monitoring and recovery for stuck document extractions ** killing worker threads ** using external processes and killing them when spinning out of control Cheers, Siegfried Goeschl On 22.05.14 06:46, Jack Krupansky wrote: Yeah, PDF extraction has
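The idea of running extraction in an external process and killing it when it gets stuck can be sketched roughly as below. This is a minimal Python illustration under stated assumptions, not the setup used in the thread: the extractor command is whatever the caller passes in, and the helper name is hypothetical.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run an extraction command; return its stdout, or None if it failed or hung."""
    try:
        completed = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # the child is killed when this expires
            check=True,
        )
    except subprocess.TimeoutExpired:
        return None  # stuck extraction: the caller can log and skip the document
    except subprocess.CalledProcessError:
        return None  # extractor crashed; the calling process itself survives
    return completed.stdout


if __name__ == "__main__":
    # Stand-in for a real extractor: a child Python process that prints text.
    print(run_with_timeout([sys.executable, "-c", "print('extracted text')"], 10))
```

Because the extractor lives in its own process, an out-of-memory error or an infinite loop in the parser only costs one document rather than the indexing service.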

Re: Indexing PDF in Apache Solr 4.8.0 - Problem.

2014-05-12 Thread Siegfried Goeschl
Hi Vignesh, can you check your SOLR Server Log?! Not all PDF documents on this planet can be processed using Tika :-) Cheers, Siegfried Goeschl On 07 May 2014, at 09:40, vignesh vignes...@ninestars.in wrote: Dear Team, I am Vignesh using the latest version 4.8.0 Apache Solr

Re: Export big extract from Solr to [My]SQL

2014-05-02 Thread Siegfried Goeschl
Hi Per, basically I see three options * use a lot of memory to cope with huge result sets * use result set paging * SOLR 4.7 supports cursors (https://issues.apache.org/jira/browse/SOLR-5463) Cheers, Siegfried Goeschl On 02.05.14 13:32, Per Steffensen wrote: Hi I want to make extracts
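The cursor option can be sketched as a simple loop. In Solr, a cursor request passes `cursorMark` (starting with `*`) together with a sort that includes the unique key, and the response carries `nextCursorMark`; the end is reached when the returned mark equals the one that was sent. The sketch below keeps the HTTP call abstract: `fetch_page` and `make_fake_fetch` are hypothetical helpers, and the fake merely stands in for a real Solr request.

```python
def iterate_with_cursor(fetch_page):
    """Yield every document by following cursor marks until they stop changing.

    fetch_page(cursor_mark) must return (docs, next_cursor_mark).
    """
    cursor_mark = "*"  # starting value for the cursorMark parameter
    while True:
        docs, next_cursor_mark = fetch_page(cursor_mark)
        yield from docs
        if next_cursor_mark == cursor_mark:
            break  # an unchanged cursor mark signals the end of the result set
        cursor_mark = next_cursor_mark


def make_fake_fetch(all_docs, page_size):
    """Stand-in for an HTTP call; uses the list offset as the cursor mark."""
    def fetch_page(cursor_mark):
        start = 0 if cursor_mark == "*" else int(cursor_mark)
        page = all_docs[start:start + page_size]
        return page, str(start + len(page))
    return fetch_page


if __name__ == "__main__":
    for doc in iterate_with_cursor(make_fake_fetch(["a", "b", "c"], 2)):
        print(doc)
```

Since the generator holds only one page at a time, each page can be written to the SQL database before the next one is fetched, which avoids the large memory footprint of the first option.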

Re: Having trouble with German compound words in Solr 4.7

2014-04-24 Thread Siegfried Goeschl
of the field so the following queries would work with the indexed word Leinenhose * leinenhosen * leinenhose * leinen hose * leinen hosen Cheers, Siegfried Goeschl On 22.04.14 12:13, Alistair wrote: I've managed to solve this (in a quite hacky sort of way) by using filter queries and the edismax

Re: Having trouble with German compound words in Solr 4.7

2014-04-18 Thread Siegfried Goeschl
is actually executed * one thing I always do for prototyping is setting up the Solritas GUI using the same query handler as the application server Cheers, Siegfried Goeschl On 18 Apr 2014, at 06:06, Alistair ali...@gmail.com wrote: Hey Jack, thanks for the reply. I added

Re: No route to host

2014-04-09 Thread Siegfried Goeschl
Hi folks, the URL looks wrong (misconfigured) http://host:8080/solr/collection1 Cheers, Siegfried Goeschl On 09 Apr 2014, at 14:28, Rallavagu rallav...@gmail.com wrote: All, I see the following error in the log file. The host that it is trying to find is itself. Wondering if anybody

Re: Anyone going to ApacheCon in Denver next week?

2014-04-06 Thread Siegfried Goeschl
Hi folks, I’m already here and would love to join :-) Cheers, Siegfried Goeschl On 05 Apr 2014, at 20:43, Doug Turnbull dturnb...@opensourceconnections.com wrote: I'll be there. I'd love to meet up. Let me know! Sent from my Windows Phone From: William Bell Sent: 4/5/2014 10:40 PM

Re: Apache Solr.

2014-02-03 Thread Siegfried Goeschl
Hi Vignesh, a few keywords for further investigations * Solr Data Import Handler * Apache Tika * Apache PDFBox Cheers, Siegfried Goeschl On 03.02.14 09:15, vignesh wrote: Hi Team, I am Vignesh, am using Apache Solr 3.6 and able to Index XML file and now trying

Re: Why do people want to deploy to Tomcat?

2013-11-12 Thread Siegfried Goeschl
Hi Alex, in my case * ignorance that Tomcat is not fully supported * Tomcat configuration and operations know-how in-house * could migrate to Jetty but need an approved change request to do so Cheers, Siegfried Goeschl On 12.11.13 04:54, Alexandre Rafalovitch wrote: Hello, I keep seeing here

Re: how to debug my own analyzer in solr

2013-10-21 Thread Siegfried Goeschl
Thread Dump and/or Remote Debugging?! Cheers, Siegfried Goeschl On 21.10.13 11:58, Mingzhu Gao wrote: More information about this , the custom analyzer just implement createComponents of Analyzer. And my configure in schema.xml is just something like : fieldType name=text_cn class

Re: solr 4.4 config trouble

2013-09-30 Thread Siegfried Goeschl
Hi Marc, what exactly is not working - no obvious problems in the logs as far as I can see Cheers, Siegfried Goeschl On 30.09.2013 at 11:44, Marc des Garets m...@ttux.net wrote: Hi, I'm running solr in tomcat. I am trying to upgrade to solr 4.4 but I can't get it to work. If someone can point

Re: how to suppress result

2008-04-07 Thread Siegfried Goeschl
Hi Evgeniy +) delete the documents if you really don't need them +) create a field 'ignored' and build an appropriate query to exclude the documents where 'ignored' is true Cheers, Siegfried Goeschl Evgeniy Strokin wrote: Hello,.. I have an odd problem. I use Solr for regular search
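The second suggestion, a flag field plus an excluding query, might look like the sketch below. It assumes a boolean field named `ignored` as in the reply; expressing the exclusion as a negative filter query (`fq`) is one possible choice rather than something the thread prescribes, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def build_exclusion_params(user_query, flag_field="ignored"):
    """Return query parameters that hide documents whose flag field is true."""
    return urlencode({
        "q": user_query,
        "fq": f"-{flag_field}:true",  # negative filter: drop flagged documents
    })


if __name__ == "__main__":
    print(build_exclusion_params("title:solr"))
```

Keeping the exclusion in a filter query leaves the user's own query untouched, so relevance scoring is not affected by the flag.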

Re: Can We append a field to the response that is not in the index but computed at runtime.

2008-03-31 Thread Siegfried Goeschl
) Cheers, Siegfried Goeschl Umar Shah wrote: On Mon, Mar 31, 2008 at 7:38 PM, Ryan McKinley [EMAIL PROTECTED] wrote: Two approaches: 1. make a map and add it to the response: rb.rsp.add( mystuff, mymap ); I tried using both Map/ NamedList it appends to the results I have to attach

Re: Combining SOLR and JAMon to monitor query execution times from a browser

2007-11-28 Thread Siegfried Goeschl
Hi Norberto, JAMon is all about aggregating statistical data and displaying the information for a web browser - the main beauty is that it is easy to define what you are monitoring such as querying domain objects per customer. Cheers, Siegfried Goeschl Norberto Meijome wrote: On Tue, 27

Combining SOLR and JAMon to monitor query execution times from a browser

2007-11-27 Thread Siegfried Goeschl
browser +) a small presentation can be found at http://people.apache.org/~sgoeschl/presentations/jamon-20070717.pdf +) if it is of general interest I can rewrite the code as a contribution Cheers, Siegfried Goeschl

Re: Need question to configure Log4j for solr

2007-07-13 Thread Siegfried Goeschl
Hi Ken, and we stopped using Resin's support for daily rolling log files since it blocks the server for 20 minutes when rotating a 20 GB logfile - please don't ask what we are doing with the daily 20 GB ... :-( Cheers, Siegfried Goeschl Ken Krugler wrote: : the troubles comes when you

Re: Need question to configure Log4j for solr

2007-07-12 Thread Siegfried Goeschl
Hi folks, would using commons-logging be an improvement? It is a common requirement to hook up different logging infrastructure .. Cheers, Siegfried Goeschl Erik Hatcher wrote: On Jul 11, 2007, at 9:07 PM, solruser wrote: How do I configure solr to use log4j logging. I am able

Re: Need question to configure Log4j for solr

2007-07-12 Thread Siegfried Goeschl
Hi Erik, the trouble comes when you integrate third-party stuff depending on log4j (as I currently do). Having said this you have a strong point when looking at http://www.qos.ch/logging/classloader.jsp Cheers, Siegfried Goeschl Erik Hatcher wrote: On Jul 12, 2007, at 9:03 AM, Siegfried

How to use bit fields to narrow a search

2007-06-26 Thread Siegfried Goeschl
/ideas where to add this processing within SOLR ... Thanks in advance Siegfried Goeschl

Re: How to use bit fields to narrow a search

2007-06-26 Thread Siegfried Goeschl
Hi Yonik, looks interesting - I'll give it a try Cheers, Siegfried Goeschl Yonik Seeley wrote: On 6/26/07, Siegfried Goeschl [EMAIL PROTECTED] wrote: Hi folks, I'm currently evaluating SOLR to implement fulltext search and within 8 hours I have my content imported and able to benchmark