Re: Difference between ["" TO *] and [* TO *] at Solr?

2014-04-04 Thread Jack Krupansky
And we can debate what it should or shouldn't be (and just check the code!) - and a clear contract is quite desirable, but this is starting to smell like an XY Problem - what is the user really trying to query - stated simply in English. -- Jack Krupansky -Original Message- From: Eri

Re: Cannot run program "svnversion" when building lucene 4.7.1

2014-04-04 Thread Chris Hostetter
: > I am trying to build lucene 4.7.1 from the sources. I can compile without : > any issues but when I try to build the dist, lucene gives me : > Cannot run program "svnversion" ... The system cannot find the specified : > file. : > : > I am compiling on Windows 7 64-bit using java version 1.7.0.

Re: Solr Search For Documents That Has Empty Content For a Given Particular Field

2014-04-04 Thread Alexandre Rafalovitch
And one solution is to use UpdateRequestProcessor that will create a separate binary field for presence/absence and query on that instead. Regards, Alex. Personal website: http://www.outerthoughts.com/ Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency On Fri, A

Re: How to reduce the search speed of solrcloud

2014-04-04 Thread Alexandre Rafalovitch
And 50 million records of 3 fields each should not become 50 GB of data. Something smells wrong there. Do you have unique IDs set up? Regards, Alex. On Sat, Apr 5, 20

Re: SOLR Jetty Server on Windows 2003

2014-04-04 Thread Alexandre Rafalovitch
You might be hitting http://en.wikipedia.org/wiki/Cross-origin_resource_sharing . Something like http://www.telerik.com/fiddler or Wireshark may allow you to see network traffic if you don't have other means. Regards, Alex.

Re: Distributed tracing for Solr via adding HTTP headers?

2014-04-04 Thread Alexandre Rafalovitch
I like the idea. No comments about implementation, leave it to others. But if it is done, maybe somebody very familiar with logging can also review Solr's current logging config. I suspect it is not optimized for troubleshooting at this point. Regards, Alex.

Re: Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-04 Thread Candygram For Mongo
error.txt below Java Platform Detected x64 Java Platform Detected -XX:MaxPermSize=64m -Xss256K -Xmx64m -Xms64m -XX:+HeapDumpOnOutOfMemoryError -XX:+CreateMinidumpOnCrash 2014-04-04 15:49:43.341:INFO:oejs.Server:jetty-8.1.8.v20121106 2014-04-04 15:49:43.353:INFO:oejdp.ScanningAppProvider:Deployme

Re: Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-04 Thread Candygram For Mongo
Guessing that the attachments won't work, I am pasting one file in each of four separate emails. database.xml On Fri, Apr 4, 2014 at 4:57 PM, Candygram For Mongo < candygram.for.mo...@gmail.com> wrote: > Does this user list allow attachments? I have four files atta

Re: Difference between ["" TO *] and [* TO *] at Solr?

2014-04-04 Thread Erick Erickson
What kind of field are you using? Not quite sure what would happen with a date or numeric field for instance. On Fri, Apr 4, 2014 at 10:28 AM, Furkan KAMACI wrote: > Hi; > > What is the difference between ["" TO *] and [* TO *] at Solr? (I tested it > at 4.5.1 and numFounds are different. > > Th

Re: Searching multivalue fields.

2014-04-04 Thread Vijay Kokatnur
I had already tested with omitTermFreqAndPositions="false" . I still got the same error. Is there something that I am overlooking? On Fri, Apr 4, 2014 at 2:45 PM, Ahmet Arslan wrote: > Hi Vijay, > > Add omitTermFreqAndPositions="false" attribute to fieldType definitions. > > omitTermFreq

Re: Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-04 Thread Ahmet Arslan
Hi, To disable auto commit, remove both the <autoCommit> and <autoSoftCommit> parts/definitions from solrconfig.xml. To disable tlog, remove <updateLog> <str name="dir">${solr.ulog.dir:}</str> </updateLog> from solrconfig.xml. To commit at the end, use the commit=true parameter: ?commit=true&command=full-import There is a checkbox for this in data import admin pa
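The full-import request with a single commit at the end can be sketched as a small URL builder. This is only an illustration: the host, port, core name (`collection1`) and the `/dataimport` handler path are assumptions, not values taken from the thread; only the `command=full-import` and `commit=true` parameters come from the message above.

```python
from urllib.parse import urlencode


def full_import_url(base_url: str, handler: str = "/dataimport") -> str:
    """Build a DataImportHandler URL that runs a full import and commits at the end.

    base_url and handler are hypothetical; adjust them to the actual core and
    the request handler name registered in solrconfig.xml.
    """
    params = {"command": "full-import", "commit": "true"}
    return f"{base_url.rstrip('/')}{handler}?{urlencode(params)}"


# Hypothetical core URL, used only for illustration.
url = full_import_url("http://localhost:8983/solr/collection1")
print(url)
```

With auto commit and the transaction log removed from solrconfig.xml, this one request is the only point at which a commit is issued.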

Re: Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-04 Thread Ahmet Arslan
Hi, This may not solve your problem, but it is generally recommended to disable auto commit and transaction logs for bulk indexing, and to issue one commit at the very end. Do you have tlogs enabled? I see "commit failed" in the error message, that's why I am suggesting this. And regarding comma separated v

Re: Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-04 Thread Candygram For Mongo
I might have forgotten to mention that we are using the DataImportHandler. I think we know how to remove auto commit. How would we force a commit at the end? On Fri, Apr 4, 2014 at 3:18 PM, Candygram For Mongo < candygram.for.mo...@gmail.com> wrote: > We would be happy to try that. That sounds c

Re: Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-04 Thread Candygram For Mongo
We would be happy to try that. That sounds counterintuitive for the high volume of records we have. Can you help me understand how that might solve our problem? On Fri, Apr 4, 2014 at 2:34 PM, Ahmet Arslan wrote: > Hi, > > Can you remove auto commit for bulk import. Commit at the very end?

Re: Does sorting skip everything having to do with relevancy?

2014-04-04 Thread Shawn Heisey
On 4/4/2014 3:13 PM, Mikhail Khludnev wrote: I suppose SolrIndexSearcher.buildTopDocsCollector() doesn't create a Collector which calls score() in this case. Hence, it shouldn't waste CPU. Just my impression. Have you tried checking it by supplying some weird formula that throws an exception? I

Re: Filter query with multiple raw/literal ORs

2014-04-04 Thread Yonik Seeley
On Fri, Apr 4, 2014 at 5:28 PM, Mikhail Khludnev wrote: > On Fri, Apr 4, 2014 at 4:08 AM, Yonik Seeley wrote: > >> Try adding a space before the first term, so the >> default lucene query parser will be used: >> > > Yonik, I'm curious: is this a feature? Yep, it was completely on purpose that

Re: Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-04 Thread Ahmet Arslan
Hi, Can you remove auto commit for the bulk import and commit at the very end? Ahmet On Saturday, April 5, 2014 12:16 AM, Candygram For Mongo wrote: In case the attached database.xml file didn't show up, I have pasted in the contents below: On Fri, Apr 4, 2014 at 11

Re: Filter query with multiple raw/literal ORs

2014-04-04 Thread Mikhail Khludnev
On Fri, Apr 4, 2014 at 4:08 AM, Yonik Seeley wrote: > Try adding a space before the first term, so the > default lucene query parser will be used: > Yonik, I'm curious: is this a feature? -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics

Re: Solr join and lucene scoring

2014-04-04 Thread Mikhail Khludnev
On Thu, Apr 3, 2014 at 1:42 PM, wrote: > Hello, > > referencing to this issue: > https://issues.apache.org/jira/browse/SOLR-4307 > > Is it still not possible with the solr query time join to use scoring? > It's still not implemented. https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/

Re: Does sorting skip everything having to do with relevancy?

2014-04-04 Thread Mikhail Khludnev
Hello Shawn, I suppose SolrIndexSearcher.buildTopDocsCollector() doesn't create a Collector which calls score() in this case. Hence, it shouldn't waste CPU. Just my impression. Have you tried checking it by supplying some weird formula that throws an exception? On Sat, Apr 5, 2014 at 12:02 AM, Sh

Re: Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-04 Thread Candygram For Mongo
In case the attached database.xml file didn't show up, I have pasted in the contents below: On Fri, Apr 4, 2014 at 11:55 AM, Candygram For Mongo < candygram.for.mo...@gmail.com> wrote: > In this case we are indexing an Oracle database. > > We do not include the data

Distributed tracing for Solr via adding HTTP headers?

2014-04-04 Thread Gregg Donovan
We have some metadata -- e.g. a request UUID -- that we log to every log line using Log4J's MDC [1]. The UUID logging allows us to connect any log lines we have for a given request across servers. Sort of like Zipkin [2]. Currently we're using EmbeddedSolrServer without sharding, so adding the UUI

Re: Does sorting skip everything having to do with relevancy?

2014-04-04 Thread Shawn Heisey
On 4/4/2014 1:48 PM, Alvaro Cabrerizo wrote: If you don't want to waste CPU time, then comment out the boost parameter in the query parser defined in your solrconfig.xml. If you can't do that, you can override it by sending the boost parameter, for example using the constant function (e.g. htt

Re: Does sorting skip everything having to do with relevancy?

2014-04-04 Thread Alvaro Cabrerizo
Hi, If you don't want to waste CPU time, then comment out the boost parameter in the query parser defined in your solrconfig.xml. If you can't do that, you can override it by sending the boost parameter, for example using the constant function (e.g. http:///...&boost=1&sort=your_sort). The

[ JOB ] - Search Specialist, Bloomberg LP [ NY and London ]

2014-04-04 Thread Anirudha Jadhav
http://jobs.bloomberg.com/job/New-York-Search-Technology-Specialist-Job-NY/45497500/ http://jobs.bloomberg.com/job/London-R&D-News-Search-Backend-Developer-Job/50463600/ keeping it short here , feel free to talk to me with more questions -- Anirudha P. Jadhav

Re: Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-04 Thread Candygram For Mongo
In this case we are indexing an Oracle database. We do not include the data-config.xml in our distribution. We store the database information in the database.xml file. I have attached the database.xml file. When we use the default merge policy settings, we get the same results. We have not t

How to see the value of "long" type (solr) ?

2014-04-04 Thread Lisheng Zhang
Hi, We use solr 3.6 to index a field of "long" type:

Re: Cannot run program "svnversion" when building lucene 4.7.1

2014-04-04 Thread Ahmet Arslan
Hi, I am not a Windows user, but if you installed that, svnversion should be somewhere on disk, probably right next to svn. Find it with a file search and add its folder to your PATH. Once you do that, you can invoke svnversion from the command line. For example, here are the executables on my compute

Re: Cannot run program "svnversion" when building lucene 4.7.1

2014-04-04 Thread Puneet Pawaia
Hi. Yes I installed Tortoise svn. Regards Puneet On 4 Apr 2014 19:35, "Ahmet Arslan" wrote: > Hi, > > When you install subversion, svnversion executable comes with that too. > Did you install any svn client for Windows? > > > > On Friday, April 4, 2014 3:38 PM, Puneet Pawaia > wrote: > Hi all. >

RE: SOLR Jetty Server on Windows 2003

2014-04-04 Thread Doug Turnbull
Are the requests cross domain? Is your browser giving errors about cross domain scripting restrictions in the browser? If you're doing cross domain browser stuff, Solr gives you the ability to do requests over JSONP which is a sneaky hack that gets around these issues. Check out my blog post for an

SOLR Jetty Server on Windows 2003

2014-04-04 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
Hi, I am trying to install Solr on Windows 2003 with the Jetty server. From the browser everything works, but when I try to access it from JavaScript code on another machine I am not getting a response. I am using XMLHttpRequest to get the response from the server using JavaScript. Any Help...? --R

Re: How to reduce the search speed of solrcloud

2014-04-04 Thread Anshum Gupta
I am not sure if you set up your SolrCloud right. Can you also provide me with the version of Solr that you're running? Also, could you tell me how you set up your SolrCloud cluster? Are the times consistent? Is this the only collection on the cluster? Also, if I am getting it right, yo

Re: How to reduce the search speed of solrcloud

2014-04-04 Thread Sathya
Hi Shawn, I have indexed 50 million documents across 5 servers. 3 servers have 8 GB of RAM, one has 24 GB, and another has 64 GB. I allocated 4 GB of RAM to Solr on each machine. I am using SolrCloud. My total index size is 50 GB across the 5 servers. Each server has 3 ZooKeepers. I still haven't checked abo

Re: Solr Search For Documents That Has Empty Content For a Given Particular Field

2014-04-04 Thread Chris Hostetter
: "field" : "" // this is the field that I want to learn which document has : it. How you (can) query for a field value like that is going to depend entirely on the FieldType/Analyzer ... if it's a string field, or uses KeywordTokenizer then q=field:"" should find it -- if you use a more tradi
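The two queries discussed in this thread (documents where the field holds an empty string versus documents that lack the field entirely) can be sketched as query-string builders. This is a minimal illustration, assuming the field is a string type or uses KeywordTokenizer as described above; the helper names are hypothetical and only the query syntax comes from the thread.

```python
from urllib.parse import parse_qs, urlencode


def empty_value_query(field: str) -> str:
    """Match documents whose field holds the empty string."""
    return f'{field}:""'


def missing_field_query(field: str) -> str:
    """Match documents that have no value at all for the field."""
    return f"-{field}:[* TO *]"


def select_query_string(q: str, fl: str) -> str:
    """URL-encode the q and fl parameters for a select request."""
    return urlencode({"q": q, "fl": fl})


qs = select_query_string(empty_value_query("field"), "field")
print(qs)
print(missing_field_query("field"))
```

URL-encoding matters here because the double quotes and colon in the raw query would otherwise be sent unescaped.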

Strange behavior of edismax and mm=0 with long queries (bug?)

2014-04-04 Thread Nils Kaiser
Hey, I am currently using solr to recognize songs and people from a list of user comments. My index stores the titles of the songs. At the moment my application builds word ngrams and fires a search with that query, which works well but is quite inefficient. So my thought was to simply use the co

Re: Solr Search For Documents That Has Empty Content For a Given Particular Field

2014-04-04 Thread Ahmet Arslan
Hi, Weird, for type="string" it works for me. What is the field type you are using? On Friday, April 4, 2014 6:25 PM, Furkan KAMACI wrote: Hi; I tried it before but it does not work 2014-04-04 18:08 GMT+03:00 Ahmet Arslan : Hi Furkan, > >q=fiel:""&fl=field works for me (4.7.0). > >Ahmet >

Re: Solr Search For Documents That Has Empty Content For a Given Particular Field

2014-04-04 Thread Furkan KAMACI
Hi; I tried it before but it does not work 2014-04-04 18:08 GMT+03:00 Ahmet Arslan : > Hi Furkan, > > q=fiel:""&fl=field works for me (4.7.0). > > Ahmet > > > On Friday, April 4, 2014 5:50 PM, Furkan KAMACI > wrote: > Hi; > > How can I find the documents that have empty content for a given field.

Re: Solr Search For Documents That Has Empty Content For a Given Particular Field

2014-04-04 Thread Ahmet Arslan
Hi Furkan, q=fiel:""&fl=field works for me (4.7.0). Ahmet On Friday, April 4, 2014 5:50 PM, Furkan KAMACI wrote: Hi; How can I find the documents that have empty content for a given field? I don't mean something like: -field:[* TO *] because it returns the documents that do not have the given parti

Re: tf and very short text fields

2014-04-04 Thread Tom Burton-West
Thanks Markus, I was thinking about normalization and was absolutely wrong about setting K1 to zero. I should have taken a look at the algorithm and walked through setting K=0. (This is easier to do looking at the formula in wikipedia http://en.wikipedia.org/wiki/Okapi_BM25 than walking through

Solr Search For Documents That Has Empty Content For a Given Particular Field

2014-04-04 Thread Furkan KAMACI
Hi; How can I find the documents that have empty content for a given field? I don't mean something like: -field:[* TO *] because it returns the documents that do not have the given particular field. I have documents something like: "field1":"some text", "field2":"some text", "field" : "" // this is the

Difference between ["" TO *] and [* TO *] at Solr?

2014-04-04 Thread Furkan KAMACI
Hi; What is the difference between ["" TO *] and [* TO *] in Solr? (I tested it on 4.5.1 and the numFound values are different.) Thanks; Furkan KAMACI

Re: How to reduce the search speed of solrcloud

2014-04-04 Thread Shawn Heisey
On 4/4/2014 1:31 AM, Sathya wrote: > Hi All, > > Hi All, I am new to Solr. And i dont know how to increase the search speed > of solrcloud. I have indexed nearly 4 GB of data. When i am searching a > document using java with solrj, solr takes more 6 seconds to return a query > result. Any one plea

Re: tf and very short text fields

2014-04-04 Thread Ahmet Arslan
Hi, Another simple approach is: if you don't use phrase queries or phrase boosting, you can set omitTermFreqAndPositions=true Ahmet On Friday, April 4, 2014 2:38 PM, Markus Jelsma wrote: Hi - In this case Walter, iirc, was looking for two things: no normalization and no flat TF (1f for tf(fl

Re: Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-04 Thread Ahmet Arslan
Hi, Which database are you using? Can you send us data-config.xml? What happens when you use the default merge policy settings? What happens when you dump your table to a comma-separated file and feed that file to Solr? Ahmet On Friday, April 4, 2014 5:10 PM, Candygram For Mongo wrote: The ramBu

Re: Query and field name with wildcard

2014-04-04 Thread Ahmet Arslan
Hi, bq. possible to search a word over the entire index. You can get a list of all searchable fields (indexed=true) programmatically via https://wiki.apache.org/solr/LukeRequestHandler And then you can feed this list to the qf parameter of (e)dismax. This could be implemented as a custom query parser
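The idea of feeding a field list to the qf parameter can be sketched as follows. This is only an illustration under stated assumptions: the list of field names is supplied by the caller (for example, collected beforehand from the LukeRequestHandler), the `rm` prefix comes from the original question in this thread, and `build_edismax_params` is a hypothetical helper, not part of Solr or SolrJ.

```python
from typing import Dict, Iterable


def build_edismax_params(user_query: str, all_fields: Iterable[str], prefix: str) -> Dict[str, str]:
    """Build edismax parameters that search every field whose name starts with prefix.

    all_fields is assumed to contain only indexed (searchable) field names.
    """
    fields = sorted(name for name in all_fields if name.startswith(prefix))
    if not fields:
        raise ValueError(f"no fields start with {prefix!r}")
    # qf takes a whitespace-separated list of field names.
    return {"defType": "edismax", "q": user_query, "qf": " ".join(fields)}


# Field names taken from the question in this thread, plus a hypothetical "id".
known_fields = ["id", "rmDocumentTitle", "rmDocumentClass", "rmDocumentSubclass", "rmDocumentArt"]
params = build_edismax_params("some_word", known_fields, "rm")
print(params["qf"])
```

Building the list on the client keeps the query short to write, at the cost of refreshing the field list whenever the schema changes.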

Re: Does sorting skip everything having to do with relevancy?

2014-04-04 Thread Shawn Heisey
On 4/4/2014 12:48 AM, Alvaro Cabrerizo wrote: > By default solr is using the sort parameter over the "score field". So if > you overwrite it using other sort field, yes solr will use the parameter > you've provided. Remember, you can use multiple fields for > sorting

Re: Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-04 Thread Candygram For Mongo
The ramBufferSizeMB was set to 6MB only on the test system to make the system crash sooner. In production that tag is commented out which I believe forces the default value to be used. On Thu, Apr 3, 2014 at 5:46 PM, Ahmet Arslan wrote: > Hi, > > out of curiosity, why did you set ramBufferSize

Re: Solr Search on Fields name

2014-04-04 Thread Ahmet Arslan
Hi Anurag, It seems that RuleA and RuleB are field names? In that case, try this query: q=RuleA:[* TO *] OR RuleB:[* TO *] Ahmet On Friday, April 4, 2014 4:15 PM, anuragwalia wrote: Hi, Thank you for your time. Problem: I cannot find a way to search keys with the "OR"
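The suggested query generalizes to any number of field names. A minimal sketch, assuming (as the reply does) that the rule keys are field names; `any_field_present` is a hypothetical helper.

```python
from typing import Iterable


def any_field_present(field_names: Iterable[str]) -> str:
    """Build a query matching documents that have a value in at least one field."""
    return " OR ".join(f"{name}:[* TO *]" for name in field_names)


query = any_field_present(["RuleA", "RuleB"])
print(query)
```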

Re: Cannot run program "svnversion" when building lucene 4.7.1

2014-04-04 Thread Ahmet Arslan
Hi, When you install Subversion, the svnversion executable comes with it too. Did you install any svn client for Windows? On Friday, April 4, 2014 3:38 PM, Puneet Pawaia wrote: Hi all. I am trying to build lucene 4.7.1 from the sources. I can compile without any issues but when I try to build

Solr Search on Fields name

2014-04-04 Thread anuragwalia
Hi, Thank you for your time. Problem: I cannot find a way to search keys with the "OR" operator, for example to find items having "RuleA" OR "RuleE". Format of Indexed Data: 1.0 . 4 2 2 2 Can anyone help me prepare a search query for key search? Regard

Cannot run program "svnversion" when building lucene 4.7.1

2014-04-04 Thread Puneet Pawaia
Hi all. I am trying to build lucene 4.7.1 from the sources. I can compile without any issues but when I try to build the dist, lucene gives me Cannot run program "svnversion" ... The system cannot find the specified file. I am compiling on Windows 7 64-bit using java version 1.7.0.45 64-bit. Whe

RE: tf and very short text fields

2014-04-04 Thread Markus Jelsma
Hi - In this case Walter, iirc, was looking for two things: no normalization and no flat TF (1f for tf(float freq) > 0). We know that k1 controls TF saturation but in BM25Similarity you can see that k1 is multiplied by the encoded norm value, taking b also into account. So setting k1 to zero ef
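The effect of k1 can be checked by hand with the textbook BM25 term-frequency component. The sketch below uses the formula from the Okapi BM25 Wikipedia article, not Lucene's BM25Similarity itself (which works with an encoded, lossy norm), so treat it as an approximation; the function name and the default values k1=1.2 and b=0.75 are assumptions for illustration.

```python
def bm25_tf(freq: float, doc_len: float, avg_doc_len: float, k1: float = 1.2, b: float = 0.75) -> float:
    """Textbook BM25 term-frequency component (without the IDF factor)."""
    if freq <= 0:
        return 0.0
    # k1 scales the length-normalization term, so both effects are tied to it.
    norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return freq * (k1 + 1.0) / (freq + norm)


# With k1 = 0 the norm term vanishes and every matching term scores 1.0,
# regardless of frequency or document length.
print(bm25_tf(1, 3, 10, k1=0.0), bm25_tf(5, 30, 10, k1=0.0))

# With b = 0 only the length normalization is removed; TF still saturates.
print(bm25_tf(1, 3, 10, b=0.0), bm25_tf(2, 30, 10, b=0.0))
```

In this textbook form, k1 = 0 removes length normalization but also flattens TF, while b = 0 removes length normalization and keeps a saturating TF.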

Re: How to reduce the search speed of solrcloud

2014-04-04 Thread Alexandre Rafalovitch
You said your request is 6 seconds when going through the SolrJ client. But it is 1 second (1000 ms) when going directly to Solr, bypassing SolrJ. So, the other 5 seconds must be added outside of Solr. Concentrate on that. Regarding your schema, you used the example schema that has a lot of stuff y

Re: How to reduce the search speed of solrcloud

2014-04-04 Thread Sathya
Hi, Sorry, I don't follow you, Alex. Can you please explain (if you can)? I have only just started with Solr. On Fri, Apr 4, 2014 at 2:20 PM, Alexandre Rafalovitch [via Lucene] < ml-node+s472066n4129077...@n3.nabble.com> wrote: > Well, if the direct browser query is 1000ms and your client query i

Re: Query and field name with wildcard

2014-04-04 Thread Alexandre Rafalovitch
Are you using eDisMax? That gives a lot of options, including field aliasing, which can map a single name to multiple fields: http://wiki.apache.org/solr/ExtendedDisMax#Field_aliasing_.2F_renaming (with example on p77 of my book http://www.packtpub.com/apache-solr-for-indexing-data/book :-) Regards,

Query and field name with wildcard

2014-04-04 Thread Croci Francesco Luigi (ID SWS)
In my index I have some fields which have the same prefix(rmDocumentTitle, rmDocumentClass, rmDocumentSubclass, rmDocumentArt). Apparently it is not possible to specify a query like this: q = rm* : some_word Is there a way to do this without having to write a long list of ORs? Another question

Re: How to reduce the search speed of solrcloud

2014-04-04 Thread Alexandre Rafalovitch
Well, if the direct browser query is 1000 ms and your client query is 6 seconds, then it is not Solr itself you need to worry about first. Something must be wrong at the client. Try timing that bit. Maybe it is writing from the client to your ultimate consumer that's the problem. Regards, Alex
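Timing the client side can be sketched as below: measure wall-clock time around the call and subtract the QTime that Solr reports in its response header. This is a rough illustration only; `fake_search` is a hypothetical stand-in for the real client call, and the sleep and QTime values are invented for the demo.

```python
import time
from typing import Any, Callable, Tuple


def timed_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run fn and return (result, elapsed wall-clock milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def client_overhead_ms(wall_ms: float, qtime_ms: float) -> float:
    """Time spent outside Solr's own query processing (network, parsing, client code)."""
    return max(0.0, wall_ms - qtime_ms)


def fake_search() -> dict:
    """Hypothetical stand-in for a real query; pretends Solr took 10 ms."""
    time.sleep(0.05)
    return {"responseHeader": {"QTime": 10}}


response, wall_ms = timed_call(fake_search)
overhead = client_overhead_ms(wall_ms, response["responseHeader"]["QTime"])
print(f"wall={wall_ms:.0f} ms, outside Solr={overhead:.0f} ms")
```

Applied to the numbers in this thread, a 6000 ms client call with a 1000 ms Solr time leaves about 5000 ms to account for on the client side.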

Re: How to reduce the search speed of solrcloud

2014-04-04 Thread Sathya
Hi, I have attached my schema.xml file too. And you are right, I have 50 million documents. When I use the Solr browser to search for a document, it returns within 1000 to 2000 ms. My query looks like this: http://10.10.1.14:5050/solr/set_recent_shard1_replica5/select?q=subject&indent=true On 4/4/1

Re: How to reduce the search speed of solrcloud

2014-04-04 Thread Alexandre Rafalovitch
What does your Solr query look like (check the Solr backend log if you don't know)? And how many documents is that? 50 million? Does not sound like much for 3 fields. And what are the definitions (schema.xml rather than solr.xml)? And what happens if you issue the query directly to Solr rather than

Re: How to reduce the search speed of solrcloud

2014-04-04 Thread Sathya
Hi Alex, 33026985 Component Audio\:A Shopping List 2012-01-11 09:02:42.96 This is what I have indexed in Solr. I have only 3 fields in the index: I am just indexing the id, subject and date of the news articles. Nearly 5 crore (50 million) documents. I have also attached my solrconfig and solr.xml files. If you need

Re: How to reduce the search speed of solrcloud

2014-04-04 Thread Alexandre Rafalovitch
Show a sample query string that does that (takes 6 seconds to return). Including all defaults you may have put in solrconfig.xml (if any). That might give us a hint which features you are using and what possible direction you could go in next. For the bonus points, enable debug flag and rows=1 para

How to reduce the search speed of solrcloud

2014-04-04 Thread Sathya
Hi All, I am new to Solr, and I don't know how to increase the search speed of SolrCloud. I have indexed nearly 4 GB of data. When I search for a document using Java with SolrJ, Solr takes more than 6 seconds to return a query result. Can anyone please help me reduce the search query time to l

Re: Solr join and lucene scoring

2014-04-04 Thread Alvaro Cabrerizo
Hi, The defect you are referencing is closed with a resolution of *Invalid*, so it seems the scoring is working fine with the join. I ran the following two tests on my own data and it seems to be working: *TestA* - fl=id,score - q=notebook - fq={!join from=product_list to=id fromIndex=prod

Re: Boosing Basic

2014-04-04 Thread Alvaro Cabrerizo
Hi, If I were you, I would start reading the edismax documentation. Apart from the wiki, you can find in every distribution a full example with the configuration of the edismax query parser (check the xml node reque