Re: Solr faceting -- sort order

2012-07-19 Thread Toke Eskildsen
On Wed, 2012-07-18 at 20:30 +0200, Christopher Gross wrote: When I do a query, the results that come through retain their original case for this field, like: doc 1 keyword: Blah Blah Blah doc 2 keyword: Yadda Yadda Yadda But when I pull back facets, i get: blah blah blah (1) yadda

Re: How To apply transformation in DIH for multivalued numeric field?

2012-07-19 Thread jmlucjav
I have seen that issue several times; in my case it was always with an id field, a MySQL db and Linux. The same config on Windows did not show that issue. Never got to the bottom of it... as it was an id, it was just working as it was unique.

Importing index - Real Time or Queued?

2012-07-19 Thread Spadez
Hi, Let's say I am running an auction site. There are 20,000 entries. 100 entries come from an on-site SQL database, the rest come from a generated txt file from scraped content. I want to import any new SQL results onto the server as quickly as possible so they are searchable but I don't want to

maxScore returned with distributed search

2012-07-19 Thread Markus Jelsma
Hi, Why is maxScore always returned with distributed search? It used to return only if score was part of fl. Bug? Feature? Thanks Markus

Re: Importing index - Real Time or Queued?

2012-07-19 Thread Toke Eskildsen
On Thu, 2012-07-19 at 12:54 +0200, Spadez wrote: I want to import any new SQL results onto the server as quickly as possible so they are searchable but I don't want to overload the server. These are my new options: 1. Devise a script to run when a new SQL item is posted, to immediately import

NGram Indexing Basic Question

2012-07-19 Thread Husain, Yavar
I have set some of my fields to be NGram Indexed. Have also set analyzer both at query as well as index level. Most of the stuff works fine except for use cases where I simply interchange couple of characters. For an example: springfield retrieves correct matches, springfi retrieves correct

Re: Importing index - Real Time or Queued?

2012-07-19 Thread Spadez
Thank you for the reply. Ok, well that brings another question. I don't like pre-optimisation, but I also don't like inefficiency, so let's see if I can strike a balance. It does seem really poor design to reimport 10,000 documents, when only one needs to be added. I don't like that, can you not

join this mailing list

2012-07-19 Thread Tomsdinary
Hi, I want to join this mailing list.

Re: join this mailing list

2012-07-19 Thread Gora Mohanty
On 19 July 2012 10:15, 晋鹏(Tomsdinary) jinpeng...@taobao.com wrote: Hi, I want to join this mailing list. Please see the very first entry under http://lucene.apache.org/solr/discussion.html Regards, Gora

Does defType override other settings for default request handler

2012-07-19 Thread amitesh116
Hi, We have used dismax in our SOLR config with defaultOperator=OR and some mm settings. Recently, we started using defType=edismax in query params. With this change, we have observed a significant drop in result counts. We suspect that SOLR is using default operator=AND and hence

Re: Importing index - Real Time or Queued?

2012-07-19 Thread Toke Eskildsen
On Thu, 2012-07-19 at 13:49 +0200, Spadez wrote: It does seem really poor design to reimport 10,000 documents, when only one needs to be added. I don't like that, can you not insert a specific entry into Solr rather than reimporting everything? Isn't that what you outlined in your option #1?

DIH is doubling field entries

2012-07-19 Thread Bernd Fehling
While porting from 3.6.1 to 4.x I noticed the doubling content of some fields in my index. Didn't have this with 3.6.1. This can also be seen with luke. I could trace it down to DIH so far. Anyone seen this? I'm using XPathEntityProcessor with RegexTransformer. Will look into this closer

Re: Result docs missing only when shards parameter present in query?

2012-07-19 Thread Erick Erickson
A multiValued uniqueKey really doesn't make any sense. But your log file should have something in it like this: SEVERE: uniqueKey should not be multivalued although it _is_ a bit hard to see on startup unless you've suppressed the INFO level output. See:

Re: Indexing data in csv format

2012-07-19 Thread Erick Erickson
Check your csv file for extraneous data? The other thing to do is look at your logs to see if more informative information is there. There's really very little info to go on here, you might review: http://wiki.apache.org/solr/UsingMailingLists Best

Re: Frustrating differences in fieldNorm between two different versions of solr indexing the same document

2012-07-19 Thread Robert Muir
On Thu, Jul 19, 2012 at 12:10 AM, Aaron Daubman daub...@gmail.com wrote: Greetings, I've been digging in to this for two days now and have come up short - hopefully there is some simple answer I am just not seeing: I have a solr 1.4.1 instance and a solr 3.6.0 instance, both configured as

Re: SOLR 4 ALPHA /terms /browse

2012-07-19 Thread Mark Miller
Can you file two JIRA issues for these? bq. but does return reasonable results when distrib is turned off like so It should default to distrib=false - I don't think /terms is distrib aware/compatible. bq. /browse returns this stack trace to the browser HTTP ERROR 500 We may be able to fix

Re: Solr grouping / facet query

2012-07-19 Thread Erick Erickson
I'm not sure your point 3 makes sense. If you're searching by author, how do you define the four most relevant titles? Relevant to what? If you are searching text of the publications, then displaying authors with no publications seems unhelpful. If you're searching the bios, how do you define

Re: Solr 4 Alpha SolrJ Indexing Issue

2012-07-19 Thread Mark Miller
we really need to resolve that issue soon... On Jul 19, 2012, at 12:08 AM, Briggs Thompson wrote: Yury, Thank you so much! That was it. Man, I spent a good long while trouble shooting this. Probably would have spent quite a bit more time. I appreciate your help!! -Briggs On Wed, Jul

Re: Importing index - Real Time or Queued?

2012-07-19 Thread Spadez
This seems to suggest you have to reindex Solr in its entirety and can't add a single document at a time, is this right? http://stackoverflow.com/questions/11247625/apache-solr-adding-editing-deleting-records-frequently

Re: Importing index - Real Time or Queued?

2012-07-19 Thread Michael Della Bitta
You can definitely do a single document at a time, but unless you're using NRT, your changes won't be visible until you do a commit. Doing a commit involves closing Searchers and reopening them, which is semi expensive... depending on how you're doing caching, you wouldn't want to do it too
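A minimal sketch of the single-document add described above, in Python. It only builds a JSON update body and sends nothing. The document, its field names, and the 5-second window are made up for illustration, and `commitWithin` is not mentioned in the thread; it is shown here as one way to avoid issuing an explicit commit after every add. The URL you would POST this to depends on your Solr version and setup.

```python
import json


def build_add_request(doc, commit_within_ms=5000):
    """Build a Solr JSON update body that adds a single document.

    commitWithin asks Solr to make the change visible within the given
    number of milliseconds, rather than committing on every add.
    """
    return json.dumps({"add": {"doc": doc, "commitWithin": commit_within_ms}})


# Hypothetical auction item; the field names are assumptions.
body = build_add_request({"id": "item-20001", "title": "Vintage lamp"})
print(body)
```

Whether a 5-second window is appropriate depends on how your caches are warmed, so treat the number as a placeholder.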

Re: Importing index - Real Time or Queued?

2012-07-19 Thread Toke Eskildsen
On Thu, 2012-07-19 at 16:00 +0200, Spadez wrote: This seems to suggest you have to reindex Solr in its entirety and can't add a single document at a time, is this right? http://stackoverflow.com/questions/11247625/apache-solr-adding-editing-deleting-records-frequently No. What it says is that

Re: Solr grouping / facet query

2012-07-19 Thread s215903406
Thanks for the reply. To clarify, the idea is to search for authors with certain specialties (eg. political, horror, etc.) and if they have any published titles relevant to the user's query, then display those titles next to the author's name. At first, I thought it would be great to have all

Re: Solr faceting -- sort order

2012-07-19 Thread Michael Della Bitta
Maybe I'm not understanding the problem, but I accomplish this by having two fields. One for sorting, like so: <fieldType name="sort" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/> <tokenizer

Re: Frustrating differences in fieldNorm between two different versions of solr indexing the same document

2012-07-19 Thread Aaron Daubman
Robert, I have a solr 1.4.1 instance and a solr 3.6.0 instance, both configured as identically as possible (given deprecations) and indexing the same document. Why did you do this? If you want the exact same scoring, use the exact same analysis. This means specifying luceneMatchVersion =

RE: How to setup SimpleFSDirectoryFactory

2012-07-19 Thread Uwe Schindler
Read this, then you will see that MMapDirectory will use 0% of your Java Heap space or free system RAM: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de

Re: Frustrating differences in fieldNorm between two different versions of solr indexing the same document

2012-07-19 Thread Robert Muir
On Thu, Jul 19, 2012 at 11:11 AM, Aaron Daubman daub...@gmail.com wrote: Apologies if I didn't clearly state my goal/concern: I am not looking for the exact same scoring - I am looking to explain scoring differences. Deprecated components will eventually go away, time moves on, etc... etc...

Importing data to Solr

2012-07-19 Thread Jonatan Fournier
Hello, I was wondering if there's other ways to import data in Solr than posting xml/json/csv to the server URL (e.g. locally building the index). Is the DataImporter only for database? My data is in an enormous text file that is parsed in python, I get clean json/xml out of it if I want, but

Re: Importing data to Solr

2012-07-19 Thread Michael Della Bitta
Hi Jonatan, Ideally you'd use a Solr API client that allowed batched updates, so you'd be sending documents 100 at a time, say. Alternatively, if you're good with Java, you could build an index by using the EmbeddedSolrServer class in the same process as the code you use to parse the documents.

Re: Importing data to Solr

2012-07-19 Thread Erick Erickson
First, turn off all your soft commit stuff, that won't help in your situation. If you do leave autocommit on, make it a really high number (let's say 1,000,000 to start). You won't have to make 300M calls, you can batch, say, 1,000 docs into each request. DIH supports a bunch of different data
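Since the data in this thread is already parsed in Python, the batching suggested above could be sketched roughly as follows. The helper name `batches` and the use of `range` as a stand-in for the parsed documents are illustrative only; each yielded chunk would be sent as one update request by whatever client you use.

```python
from itertools import islice


def batches(docs, size=1000):
    """Yield successive lists of at most `size` documents from any iterable.

    Works lazily, so an enormous input never has to fit in memory at once.
    """
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# Stand-in for parsed documents: 2,500 items split into requests of 1,000.
sizes = [len(chunk) for chunk in batches(range(2500), 1000)]
print(sizes)  # [1000, 1000, 500]
```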

Re: Problem with Solr logging under Jetty

2012-07-19 Thread Rémy Loubradou
Hello, I have a similar problem, anything new about this issue? My problem is that info logs go to stderr and not stdout, do you have an explanation? For the log level I use the file logging.properties with in it only one line setting the level.

Solr Commit not working after delete

2012-07-19 Thread Rohit
We deleted some data from Solr, after which Solr is not accepting any commits. What could be wrong? We don't see any error in logs or anywhere else. Regards, Rohit

Re: Solr Commit not working after delete

2012-07-19 Thread Brendan Grainger
You might be running into the same issue someone else had the other day: https://issues.apache.org/jira/browse/SOLR-3432 On Jul 19, 2012, at 1:23 PM, Rohit wrote: We deleted some data from Solr, after which Solr is not accepting any commits. What could be wrong? We don't see any error

RE: Solr Commit not working after delete

2012-07-19 Thread Rohit
Hi Brendan, I am not sure I get what's being suggested. Our delete worked fine, but now no new data is going into the system. Could you please shed some more light on this? Regards, Rohit -Original Message- From: Brendan Grainger [mailto:brendan.grain...@gmail.com] Sent: 19 July 2012 17:33

Re: Frustrating differences in fieldNorm between two different versions of solr indexing the same document

2012-07-19 Thread Aaron Daubman
Robert, So this is lossy: basically you can think of there being only 256 possible values. So when you increased the number of terms only slightly by changing your analysis, this happened to bump you over the edge rounding you up to the next value. more information:
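To illustrate the "only 256 possible values" point, here is a rough Python port of the one-byte float encoding that, as far as I know, Lucene's default similarity used for norms in these versions (`SmallFloat.floatToByte315`). It assumes the default length norm of 1/sqrt(number of terms) with no boosts; the function names are my own and this is an illustration, not the code from either Solr instance.

```python
import math
import struct


def float_to_byte315(f):
    """Encode a float into one byte (0..255), truncating the mantissa."""
    bits = struct.unpack(">i", struct.pack(">f", f))[0]  # raw float32 bits
    small = bits >> 21
    base = (63 - 15) << 3
    if small <= base:
        return 0 if bits <= 0 else 1  # zero/negative, or underflow
    if small >= base + 0x100:
        return 255  # overflow: clamp to the largest byte
    return small - base


def byte315_to_float(b):
    """Decode a byte produced by float_to_byte315 back into a float."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << 21) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def stored_norm(num_terms):
    """Length norm as it would come back after the one-byte round trip."""
    return byte315_to_float(float_to_byte315(1.0 / math.sqrt(num_terms)))


for n in (4, 5, 6, 7):
    print(n, stored_norm(n))  # 0.5, 0.4375, 0.375, 0.375
```

Under these assumptions, fields with 6 and 7 terms get the same stored norm, while going from 4 to 5 terms drops the norm a full step, which is the kind of edge effect the reply describes.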

Re: Solr 4 Alpha SolrJ Indexing Issue

2012-07-19 Thread Briggs Thompson
This is unrelated for the most part, but the javabin update request handler does not seem to be working properly when calling the SolrJ method HttpSolrServer.deleteById(List<String> ids). A single Id gets deleted from the index as opposed to the full list. It appears properly in the logs - shows

Re: Reg issue with indexing data from one of the sqlserver DB

2012-07-19 Thread Michael Della Bitta
Your password has an & in it. Since this is an XML file, you need to turn it into an XML entity, so your password should be entered as: 8ty&amp;2ty=6 Michael Della Bitta Appinions, Inc. -- Where Influence Isn’t a Game. http://www.appinions.com On
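If you would rather not escape such values by hand, Python's standard library can do it. This small sketch uses the password from the reply with its ampersand restored; how the escaped value is then written into the DIH config file is left out.

```python
from xml.sax.saxutils import escape, quoteattr

raw_password = "8ty&2ty=6"

# escape() handles &, < and > for use in XML text or attribute values.
escaped = escape(raw_password)
print(escaped)  # 8ty&amp;2ty=6

# quoteattr() additionally wraps the value in quotes for an attribute.
attribute = quoteattr(raw_password)
print("password=" + attribute)  # password="8ty&amp;2ty=6"
```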

Re: Solr 4 Alpha SolrJ Indexing Issue

2012-07-19 Thread Mark Miller
https://issues.apache.org/jira/browse/SOLR-3649 On Thu, Jul 19, 2012 at 3:34 PM, Briggs Thompson w.briggs.thomp...@gmail.com wrote: This is unrelated for the most part, but the javabin update request handler does not seem to be working properly when calling solrj

Re: solr 4.0 cloud 303 error

2012-07-19 Thread Mark Miller
That's really odd - never seen or heard anything like it. A 303 is what a server will respond with if you should GET a different URI... This won't happen out of the box that I've ever seen... can you tell us about any customizations you have made? On Thu, Jul 19, 2012 at 1:08 PM, John-Paul

RE: solr 4.0 cloud 303 error

2012-07-19 Thread John-Paul Drawneek
This is just out of the box. All I did was download Solr 4 Alpha from the site, unpack it, and follow the instructions from the wiki. The admin console worked - great. Tried to do a search - throws a 303 error. Downloaded the nightly build, same issue. Also got errors from the other shard with error connecting due to

Re: solr 4.0 cloud 303 error

2012-07-19 Thread Mark Miller
Okay - I'll do the same in a bit and report back. On Jul 19, 2012, at 5:23 PM, John-Paul Drawneek wrote: This is just out of the box. All I did was download solr 4 Alpha from the site. unpack follow instructions from wiki. admin console worked - great try to do a search - throws 303

Re: solr 4.0 cloud 303 error

2012-07-19 Thread Chris Hostetter
: try to do a search - throws 303 error Can you be specific about how exactly you did the search? Was this from the admin UI? what URL was in your browser location bar? what values did you put in the form? what buttons did you click? what URL was in your browser location bar when the error

Re: Count is inconsistent between facet and stats

2012-07-19 Thread Chris Hostetter
: So from StatsComponent the count for 'electronics' cat is 3, while : FacetComponent report 14 'electronics'. Is this a bug? : : Following is the field definition for 'cat'. : <field name="cat" type="string" indexed="true" stored="true" : multiValued="true"/> FYI...

RE: solr 4.0 cloud 303 error

2012-07-19 Thread John-Paul Drawneek
I did a search via both the admin UI and /search. What I searched for was *:* as that was the default in the search box in the admin UI (so I expected something that was not a 303 error). Will post the URL and server logs tomorrow when I am back in the office. But I think the admin URL was not anything odd.

Re: Is it possible to alias a facet field?

2012-07-19 Thread Chris Hostetter
: facet.field=testfield&facet.field=%7B!key=mylabel%7Dtestfield&f.mylabel.limit=1 : : but the limit on the alias didn't seem to work. Is this expected? : : Per-field params don't currently look under the alias. I believe : there's a JIRA open for this.
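For reference, a query string like the one quoted here can be assembled and percent-encoded with Python's standard library. This only shows how the three parameters are separated and encoded; it does not work around the alias limitation described in the reply.

```python
from urllib.parse import parse_qsl, urlencode

# A list of pairs (not a dict) because facet.field appears twice.
params = [
    ("facet.field", "testfield"),
    ("facet.field", "{!key=mylabel}testfield"),  # same field, aliased
    ("f.mylabel.limit", "1"),                    # per-field param on the alias
]

query = urlencode(params)
print(query)

# Decoding the string gives back the original parameter list.
decoded = parse_qsl(query)
print(decoded == params)
```

Note that `urlencode` also encodes `!` and `=` inside the local-params value, so the output is not byte-for-byte the string from the message, but it decodes to the same parameters.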

Re: Solr 4 Alpha SolrJ Indexing Issue

2012-07-19 Thread Briggs Thompson
Thanks Mark! On Thu, Jul 19, 2012 at 4:07 PM, Mark Miller markrmil...@gmail.com wrote: https://issues.apache.org/jira/browse/SOLR-3649 On Thu, Jul 19, 2012 at 3:34 PM, Briggs Thompson w.briggs.thomp...@gmail.com wrote: This is unrelated for the most part, but the javabin update request

How to Increase the number of connexion on Solr/Tomcat6?

2012-07-19 Thread Bruno Mannina
Dear Solr User, I don't know if it's here that my question must be posted but I'm sure some users have already had my problem. Currently, I do 1556 requests with 4 Http components with my program. If I do these requests without a delay (500ms) before sending each request I have around 10% of

Re: How to Increase the number of connexion on Solr/Tomcat6?

2012-07-19 Thread Michael Della Bitta
Hi Bruno, It's usually the maxThreads attribute in the Connector tag in $CATALINA_HOME/conf/server.xml. But I kind of doubt you're running out of threads... maybe you could post some more details about the system you're running Solr on. Michael Della Bitta

Redirecting SolrQueryRequests to another core with Handler

2012-07-19 Thread Nicholas Ball
What is the best way to redirect a SolrQueryRequest to another core from within a handler (custom SearchHandler)? I've tried to find the SolrCore of the core I want to redirect to and called the execute() method with the same params but it looks like the SolrQueryRequest object already has the

Re: queryResultCache not checked with fieldCollapsing

2012-07-19 Thread Chris Hostetter
: When I run dismax queries I see there are no lookups in the : queryResultCache. If I remove the field collapsing - lookups happen. I : can't find any mention of this anywhere or think of reason why this should I'm not very familiar with the grouping code, but I think the crux of what you

custom sorter

2012-07-19 Thread Siping Liu
Hi, I have requirements to place a document to a pre-determined position for special filter query values, for instance when filter query is fq=(field1:xyz) place document abc as first result (the rest of the result set will be ordered by sort=field2). I guess I have to plug in my Java code as a

Re: How to setup SimpleFSDirectoryFactory

2012-07-19 Thread Bill Bell
Thanks. Are you saying that if we run low on memory, the MMapDirectory will stop using it? The least used memory will be removed from the OS automatically? I see some paging. Wouldn't paging slow down the querying? My index is 10GB and every 8 hours we get most of it in shared memory. The

Re: help: I always get NULL with row.get(columnName)

2012-07-19 Thread Roy Liu
Does anyone know? On Thu, Jul 19, 2012 at 5:48 PM, Roy Liu liuchua...@gmail.com wrote: Hi, When I use a Transformer to handle files, I always get NULL with row.get(columnName). Does anyone know why? -- The following file is data-config.xml: <dataConfig> <dataSource type="JdbcDataSource"