Re: First query to find meta data, second to search. How to group into one?

2012-05-15 Thread Mikhail Khludnev
Hello,

have you checked MoreLikeThis feature?
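
For reference, a minimal SolrJ sketch of what invoking MoreLikeThis could look
like against fields such as those in the quoted message (the URL, seed document
id and field names are placeholders, and SolrJ 3.x is assumed):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class MltSketch {
    public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
        SolrQuery q = new SolrQuery("id:12345");  // placeholder seed document
        q.set("mlt", true);                       // enable the MoreLikeThis component
        q.set("mlt.fl", "title,description");     // fields to mine for "interesting" terms
        q.set("mlt.mintf", 1);
        q.set("mlt.mindf", 1);
        QueryResponse rsp = solr.query(q);
        // similar documents come back under the "moreLikeThis" section of the response
        System.out.println(rsp.getResponse().get("moreLikeThis"));
    }
}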

On Tue, May 15, 2012 at 11:26 PM, Samarendra Pratap wrote:

>   - We are calculating frequency of category ids in these top results. We
>   are not using facets because that gives count for all, relevant or
>   irrelevant, results.
>   - Based on category frequencies within top matching results we are
>   trying to find a few most frequent categories by simple calculation. Now
> we
>   are very confident that these categories are the ones which best suit to
>   our query.
>



-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics


 


Re: - Solr 4.0 - How do I enable JSP support ? ...

2012-05-15 Thread Ryan McKinley
just use the admin UI -- look at the 'cloud' tab


On Tue, May 15, 2012 at 12:53 PM, Naga Vijayapuram  wrote:
> Alright; thanks.  Tried with "-OPTIONS=jsp" and am still seeing this on
> console …
>
> 2012-05-15 12:47:08.837:INFO:solr:No JSP support.  Check that JSP jars are
> in lib/jsp and that the JSP option has been specified to start.jar
>
> I am trying to go after
> http://localhost:8983/solr/collection1/admin/zookeeper.jsp (or its
> equivalent in 4.0) after going through
> http://wiki.apache.org/solr/SolrCloud
>
> May I know the right zookeeper url in 4.0 please?
>
> Thanks
> Naga
>
>
> On 5/15/12 10:56 AM, "Ryan McKinley"  wrote:
>
>>In 4.0, solr no longer uses JSP, so it is not enabled in the example
>>setup.
>>
>>You can enable JSP in your servlet container using whatever method
>>they provide.  For Jetty, using start.jar, you need to add the command
>>line: java -jar start.jar -OPTIONS=jsp
>>
>>ryan
>>
>>
>>
>>On Mon, May 14, 2012 at 2:34 PM, Naga Vijayapuram 
>>wrote:
>>> Hello,
>>>
>>> How do I enable JSP support in Solr 4.0 ?
>>>
>>> Thanks
>>> Naga
>


Re: Solr Caches

2012-05-15 Thread Otis Gospodnetic
Rahul,

Get SPM for Solr from http://sematext.com/spm and you'll get all the insight
into your cache utilization you need, and more. Through it you will get
(faster) answers to all your questions if you play with your Solr config
settings and observe the cache metrics in SPM.

Otis

Performance Monitoring for Solr / ElasticSearch / HBase - 
http://sematext.com/spm 



>
> From: Rahul R 
>To: solr-user@lucene.apache.org 
>Sent: Tuesday, May 15, 2012 3:20 PM
>Subject: Solr Caches
> 
>Hello,
>I am trying to understand how I can size the caches for my solr powered
>application. Some details on the index and application :
>Solr Version : 1.3
>JDK : 1.5.0_14 32 bit
>OS : Solaris 10
>App Server : Weblogic 10 MP1
>Number of documents : 1 million
>Total number of fields : 1000 (750 strings, 225 int/float/double/long, 25
>boolean)
>Number of fields on which faceting and filtering can be done : 400
>Physical size of  index : 600MB
>Number of unique values for a field : Ranges from 5 - 1000. Average of 150
>-Xms and -Xmx vals for jvm : 3G
>Expected number of concurrent users : 15
>No sorting planned for now
>
>Now I want to set appropriate values for the caches. I have put below some
>of my understanding and questions about the caches. Please correct and
>answer accordingly.
>FilterCache:
>As per the solr wiki, this is used to store an unordered list of Ids of
>matching documents for an fq param.
>So if a query contains two fq params, it will create two separate entries
>for each of these fq params. The value of each entry is the list of ids of
>all documents across the index that match the corresponding fq param. Each
>entry is independent of any other entry.
>A minimum size for filterCache could be (total number of fields * avg
>number of unique values per field) ? Is this correct ? I have not enabled
>.
>Max physical size of the filter cache would be (size * avg byte size of a
>document id * avg number of docs returned per fq param) ?
>
>QueryResultsCache:
>Used to store an ordered list of ids of the documents that match the most
>commonly used searches. So if my query is something like
>q=Status:Active&fq=Org:Apache&fq=Version:13, it will create one entry that
>contains list of ids of documents that match this full query. Is this
>correct ? How can I size my queryResultsCache ? Some entries from
>solrconfig.xml :
>50
>200
>Max physical size of the filterCache would be (size * avg byte size of a
>document id * avg number of docs per query). Is this correct ?
>
>
>documentCache:
>Stores the documents that are stored in the index. So I do two searches
>that return three documents each with 1 document being common between both
>result sets. This will result in 5 entries in the documentCache for the 5
>unique documents that have been returned for the two queries ? Is this
>correct ? For sizing, SolrWiki states that "*The size for the documentCache
>should always be greater than <max_results> * <max_concurrent_queries>*".
>Why do we need the max_concurrent_queries parameter here ? Is it when
>max_results is much lesser than numDocs ? In my case, a q=*:*search is done
>the first time the index is loaded. So, will setting documentCache size to
>numDocs be correct ? Can this be like the max that I need to allocate ?
>Max physical size of document cache would be (size * avg byte size of a
>document in the index). Is this correct ?
>
>Thank you
>
>-Rahul
>
>
>

Re: - When is Solr 4.0 due for Release? ...

2012-05-15 Thread Otis Gospodnetic
Hi Naga,

I'll guess . Fall 2012.

Otis 

Performance Monitoring for Solr / ElasticSearch / HBase - 
http://sematext.com/spm 



>
> From: Naga Vijayapuram 
>To: "solr-user@lucene.apache.org"  
>Sent: Tuesday, May 15, 2012 5:17 PM
>Subject: - When is Solr 4.0 due for Release? ...
> 
>… Any idea, anyone?
>
>Thanks
>Naga
>
>
>

Re: should i upgrade

2012-05-15 Thread Otis Gospodnetic
Hi,

I don't think you can set that, but you may still want to upgrade.  Solr 3.6 
has a lower memory footprint, is faster, and has more features.

Otis 

Performance Monitoring for Solr / ElasticSearch / HBase - 
http://sematext.com/spm 



>
> From: Jon Kirton 
>To: solr-user@lucene.apache.org 
>Sent: Tuesday, May 15, 2012 5:47 PM
>Subject: should i upgrade
> 
>We're running solr v1.4.1 w/ approx 30M - 40M records at any given time.
>Often, socket timeout exceptions occur for a search query.  Is there a
>compelling reason to upgrade?  I.e. can you set a socket timeout in
>solrconfig.xml in the latest version and not in v1.4.1 ?
>
>
>
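
For what it's worth, a socket timeout can also be set on the SolrJ client side,
independent of solrconfig.xml; a minimal sketch assuming SolrJ's
CommonsHttpSolrServer (host and timeout values are placeholders):

import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class TimeoutSketch {
    public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer solr = new CommonsHttpSolrServer("http://solrhost:8983/solr");
        solr.setConnectionTimeout(1000); // ms to wait when opening a connection
        solr.setSoTimeout(5000);         // ms to wait on the socket for a response
    }
}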

Re: Editing long Solr URLs - Chrome Extension

2012-05-15 Thread Amit Nithian
Erick

Yes thanks I did see that and am working on a solution to that already.
Hope to post a new revision shortly and eventually migrate to the extension
"store".

Cheers
Amit
On May 15, 2012 9:20 AM, "Erick Erickson"  wrote:

> I think I put one up already, but in case I messed up github, complex
> params like the fq here:
>
> http://localhost:8983/solr/select?q=:&fq={!geofilt sfield=store
> pt=52.67,7.30 d=5}
>
> aren't properly handled.
>
> But I'm already using it occasionally
>
> Erick
>
> On Tue, May 15, 2012 at 10:02 AM, Amit Nithian  wrote:
> > Jan
> >
> > Thanks for your feedback! If possible can you file these requests on the
> > github page for the extension so I can work on them? They sound like
> great
> > ideas and I'll try to incorporate all of them in future releases.
> >
> > Thanks
> > Amit
> > On May 11, 2012 9:57 AM, "Jan Høydahl"  wrote:
> >
> >> I've been testing
> >> https://chrome.google.com/webstore/detail/mbnigpeabbgkmbcbhkkbnlidcobbapff?hl=en
> >> but I don't think it's great.
> >>
> >> Great work on this one. Simple and straight forward. A few wishes:
> >> * Sticky mode? This tool would make sense in a sidebar, to do rapid
> >> refinements
> >> * If you edit a value and click "TAB", it is not updated :(
> >> * It should not be necessary to URLencode all non-ascii chars - why not
> >> leave colon, caret (^) etc as is, for better readability?
> >> * Some param values in Solr may be large, such as "fl", "qf" or "bf".
> >> Would be nice if the edit box was multi-line, or perhaps adjusts to the
> >> size of the content
> >>
> >> --
> >> Jan Høydahl, search solution architect
> >> Cominvent AS - www.facebook.com/Cominvent
> >> Solr Training - www.solrtraining.com
> >>
> >> On 11. mai 2012, at 07:32, Amit Nithian wrote:
> >>
> >> > Hey all,
> >> >
> >> > I don't know about you but most of the Solr URLs I issue are fairly
> >> > lengthy full of parameters on the query string and browser location
> >> > bars aren't long enough/have multi-line capabilities. I tried to find
> >> > something that does this but couldn't so I wrote a chrome extension to
> >> > help.
> >> >
> >> > Please check out my blog post on the subject and please let me know if
> >> > something doesn't work or needs improvement. Of course this can work
> >> > for any URL with a query string but my motivation was to help edit my
> >> > long Solr URLs.
> >> >
> >> >
> >>
> http://hokiesuns.blogspot.com/2012/05/manipulating-urls-with-long-query.html
> >> >
> >> > Thanks!
> >> > Amit
> >>
> >>
>


Re: doing a full-import after deleting records in the database - maxDocs

2012-05-15 Thread geeky2
hello 

thanks for the reply

this is the output - docsPending = 0

commits : 1786
autocommit maxDocs : 1000
autocommit maxTime : 6ms
autocommits : 1786
optimizes : 3
rollbacks : 0
expungeDeletes : 0
docsPending : 0
adds : 0
deletesById : 0
deletesByQuery : 0
errors : 0
cumulative_adds : 1787752
cumulative_deletesById : 0
cumulative_deletesByQuery : 3
cumulative_errors : 0 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/doing-a-full-import-after-deleting-records-in-the-database-maxDocs-tp3983948p3983995.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Exception in DataImportHandler (stack overflow)

2012-05-15 Thread Jon Drukman
OK, setting the wait_timeout back to its previous value and adding readOnly
didn't help, I got the stack overflow again.  I re-upped the mysql timeout
value again.

-jsd-


On Tue, May 15, 2012 at 2:42 PM, Jon Drukman  wrote:

> I fixed it for now by upping the wait_timeout on the mysql server.
>  Apparently Solr doesn't like having its connection yanked out from under
> it and/or isn't smart enough to reconnect if the server goes away.  I'll
> set it back the way it was and try your readOnly option.
>
> Is there an option with DataImportHandler to have it transmit one or more
> arbitrary SQL statements after connecting?  If there was, I could just send
> "SET wait_timeout=86400;" after connecting.  That would probably prevent
> this issue.
>
> -jsd-
>
> On Tue, May 15, 2012 at 2:35 PM, Dyer, James wrote:
>
>> Shot in the dark here, but try adding readOnly="true" to your dataSource
>> tag.
>>
>> 
>>
>> This sets autocommit to true and sets the Holdability to
>> ResultSet.CLOSE_CURSORS_AT_COMMIT.  DIH does not explicitly close
>> resultsets and maybe if your JDBC driver also manages this poorly you could
>> end up with strange conditions like the one you're getting?  It could be a
>> case where your data has grown just over the limit your setup can handle
>> under such an unfortunate circumstance.
>>
>> Let me know if this solves it.  If so, we probably should open a bug
>> report and get this fixed in DIH.
>>
>> James Dyer
>> E-Commerce Systems
>> Ingram Content Group
>> (615) 213-4311
>>
>>
>> -Original Message-
>> From: Jon Drukman [mailto:jdruk...@gmail.com]
>> Sent: Tuesday, May 15, 2012 4:12 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Exception in DataImportHandler (stack overflow)
>>
>> i don't think so, my config is straightforward:
>>
>> 
>>  > url="jdbc:mysql://x/xx"
>> user="x" password="x" batchSize="-1" />
>>  
>>>   query="select content_id, description, title, add_date from
>> content_solr where active = '1'">
>>   >  query="select tag_id from tags_assoc where content_id =
>> '${content.content_id}'" />
>>   >  query="select count(1) as likes from votes where content_id =
>> '${content.content_id}'" />
>>   >  query="select sum(views) as views from media_views mv join
>> content_media cm USING (media_id) WHERE cm.content_id =
>> '${content.content_id}'" />
>>
>>  
>> 
>>
>> i'm triggering the import with:
>>
>> http://localhost:8983/solr/dataimport?command=full-import&clean=true&commit=true
>>
>>
>>
>> On Tue, May 15, 2012 at 2:07 PM, Michael Della Bitta <
>> michael.della.bi...@appinions.com> wrote:
>>
>> > Hi, Jon:
>> >
>> > Well, you don't see that every day!
>> >
>> > Is it possible that you have something weird going on in your DDL
>> > and/or queries, like a tree schema that now suddenly has a cyclical
>> > reference?
>> >
>> > Michael
>> >
>> > On Tue, May 15, 2012 at 4:33 PM, Jon Drukman 
>> wrote:
>> > > I have a machine which does a full update using DataImportHandler
>> every
>> > > hour.  It worked up until a little while ago.  I did not change the
>> > > dataconfig.xml or version of Solr.
>> > >
>> > > Here is the beginning of the error in the log (the real thing runs for
>> > > thousands of lines)
>> > >
>> > > 2012-05-15 12:44:30.724166500 SEVERE: Full Import
>> > > failed:org.apache.solr.handler.dataimport.DataImportHandlerException:
>> > > java.lang.StackOverflowError
>> > > 2012-05-15 12:44:30.724168500 at
>> > >
>> >
>> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:669)
>> > > 2012-05-15 12:44:30.724169500 at
>> > >
>> >
>> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:268)
>> > > 2012-05-15 12:44:30.724171500 at
>> > >
>> >
>> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:187)
>> > > 2012-05-15 12:44:30.724219500 at
>> > >
>> >
>> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359)
>> > > 2012-05-15 12:44:30.724221500 at
>> > >
>> >
>> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427)
>> > > 2012-05-15 12:44:30.724223500 at
>> > >
>> >
>> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
>> > > 2012-05-15 12:44:30.724224500 Caused by: java.lang.StackOverflowError
>> > > 2012-05-15 12:44:30.724225500 at
>> > > java.lang.String.checkBounds(String.java:404)
>> > > 2012-05-15 12:44:30.724234500 at
>> java.lang.String.(String.java:450)
>> > > 2012-05-15 12:44:30.724235500 at
>> java.lang.String.(String.java:523)
>> > > 2012-05-15 12:44:30.724236500 at
>> > > java.net.SocketOutputStream.socketWrite0(Native Method)
>> > > 2012-05-15 12:44:30.724238500 at
>> > > java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
>> > > 2012-05-15 12:44:30.724239500 at
>> > > java.net.SocketOutputStream.write(SocketOutputStream.java:153)
>> > > 2012-05-15 12:44:30.724253500 at
>> > > java.io.BufferedOutputStream.flushBuffer(Buf

Distributed search between solrclouds?

2012-05-15 Thread Darren Govoni
Hi,
  Would distributed search (the old way where you provide the solr host
IP's etc.) still work between different solrclouds?

thanks,
Darren



Re: Boosting on field empty or not

2012-05-15 Thread Ahmet Arslan

Just tested to make sure. queryNorm changes after you add the bq parameter. For
example: 0.00317763 = queryNorm becomes 0.0028020076 = queryNorm.

Since all scores are multiplied by this queryNorm factor, the score of a document
(even if it is not affected/boosted by bq) changes.

before &bq=SOURCE:Haberler^100

5.246903
4529806
EnSonHaber


after &bq=SOURCE:Haberler^100

4.626675
4529806
EnSonHaber


Does that make sense? 

> > If the bq is only supposed apply the
> > boost when the field value is greater
> > than 0.01 why would trying another query make sure this
> is
> > working.
> > 
> > Its applying the boost to all the fields, yes when the
> boost
> > is high enough
> > most of documents with a value GT 0.01 show up first
> however
> > since it is
> > applying the boost to all the documents sometimes
> documents
> > without a value
> > in this field appear before those that do.
> 
> If boosting is applied to all documents, then why result
> order is changing?
> 
> Sometimes documents without a value can show-up before
> because there are other factors that contribute score
> calculation. 
> 
> http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/search/Similarity.html
> 
> If you add &debugQuery=on, you can see detailed
> explanation about how calculation is done.
> 


Re: First query to find meta data, second to search. How to group into one?

2012-05-15 Thread SUJIT PAL
Hi Samarendra,

This does look like a candidate for a custom query component if you want to do 
this inside Solr. You can of course continue to do this at the client.

-sujit

On May 15, 2012, at 12:26 PM, Samarendra Pratap wrote:

> Hi,
> I need a suggestion for improving relevance of search results. Any
> help/pointers are appreciated.
> 
> We have following fields (plus a lot more) in our schema
> 
> title
> description
> category_id (multivalued)
> 
> We are using mm=70% in solrconfig.xml
> We are using qf=title description
> We are not doing phrase query in "q"
> 
> In case of a multi-word search text, the trailing results are mostly junk
> ones, because the words mentioned in the search text appear in different
> fields and in different contexts.
> For example searching for "water proof" (without double quotes) brings a
> record where title = "rose water" and description = "... no proof of
> contamination ..."
> 
> Our priority is to remove irrelevant results as much as possible.
> Increasing "mm" will not solve this completely because user input may not
> always be precise enough to benefit from a high "mm".
> 
> To remove irrelevant records we worked on following solution (or
> work-around)
> 
>   - We are firing first query to get top "n" results. We assume that first
>   "n" results are mostly good results. "n" is dynamic within a predefined
>   minimum and maximum value.
>   - We are calculating frequency of category ids in these top results. We
>   are not using facets because that gives count for all, relevant or
>   irrelevant, results.
>   - Based on category frequencies within top matching results we are
>   trying to find a few most frequent categories by simple calculation. Now we
>   are very confident that these categories are the ones which best suit to
>   our query.
>   - Finally we are firing a second query with top categories, calculated
>   above, in filter query (fq).
> 
> 
> The quality of results really increased very much so I thought to try it
> the standard way.
> Does it require writing a plugin if I want to move above logic into Solr?
> Which component do I need to modify - QueryComponent?
> 
> Or is there any better or even equivalent method in Solr of doing this or
> similar thing?
> 
> 
> 
> Thanks
> 
> -- 
> Regards,
> Samar
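
A rough client-side sketch of the two-pass approach described above (SolrJ is
assumed; the field names come from the message, while the URL, row count and
frequency threshold are arbitrary placeholders):

import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

public class TwoPassSearch {
    public static QueryResponse search(String userQuery) throws Exception {
        CommonsHttpSolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");

        // Pass 1: fetch the top "n" matches and count category_id frequencies.
        SolrQuery first = new SolrQuery(userQuery);
        first.set("defType", "dismax");
        first.set("qf", "title description");
        first.setRows(100);                      // "n" (placeholder)
        first.setFields("category_id");
        Map<String, Integer> freq = new HashMap<String, Integer>();
        for (SolrDocument doc : solr.query(first).getResults()) {
            Collection<Object> cats = doc.getFieldValues("category_id");
            if (cats == null) continue;
            for (Object c : cats) {
                Integer n = freq.get(c.toString());
                freq.put(c.toString(), n == null ? 1 : n + 1);
            }
        }

        // Keep the most frequent categories (placeholder threshold).
        List<String> top = new ArrayList<String>();
        for (Map.Entry<String, Integer> e : freq.entrySet()) {
            if (e.getValue() >= 10) top.add("category_id:" + e.getKey());
        }

        // Pass 2: rerun the query, filtered to the chosen categories.
        SolrQuery second = new SolrQuery(userQuery);
        second.set("defType", "dismax");
        second.set("qf", "title description");
        if (!top.isEmpty()) {
            StringBuilder fq = new StringBuilder();
            for (String t : top) {
                if (fq.length() > 0) fq.append(" OR ");
                fq.append(t);
            }
            second.addFilterQuery(fq.toString());
        }
        return solr.query(second);
    }
}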



should i upgrade

2012-05-15 Thread Jon Kirton
We're running solr v1.4.1 w/ approx 30M - 40M records at any given time.
 Often, socket timeout exceptions occur for a search query.  Is there a
compelling reason to upgrade?  I.e. can you set a socket timeout in
solrconfig.xml in the latest version and not in v1.4.1 ?


Re: Exception in DataImportHandler (stack overflow)

2012-05-15 Thread Jon Drukman
I fixed it for now by upping the wait_timeout on the mysql server.
 Apparently Solr doesn't like having its connection yanked out from under
it and/or isn't smart enough to reconnect if the server goes away.  I'll
set it back the way it was and try your readOnly option.

Is there an option with DataImportHandler to have it transmit one or more
arbitrary SQL statements after connecting?  If there was, I could just send
"SET wait_timeout=86400;" after connecting.  That would probably prevent
this issue.

-jsd-
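
One possible workaround (an assumption about the JDBC driver, not a DIH
feature): MySQL Connector/J has a sessionVariables connection property that
issues SET statements when a connection is opened, so the timeout could ride
along on the JDBC url used in the dataSource. A small sketch with placeholder
host and credentials:

import java.sql.Connection;
import java.sql.DriverManager;

public class SessionVarSketch {
    public static void main(String[] args) throws Exception {
        // The same url form should also work in the DIH dataSource's url attribute.
        String url = "jdbc:mysql://dbhost/dbname?sessionVariables=wait_timeout=86400";
        Connection conn = DriverManager.getConnection(url, "user", "password");
        conn.close();
    }
}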

On Tue, May 15, 2012 at 2:35 PM, Dyer, James wrote:

> Shot in the dark here, but try adding readOnly="true" to your dataSource
> tag.
>
> 
>
> This sets autocommit to true and sets the Holdability to
> ResultSet.CLOSE_CURSORS_AT_COMMIT.  DIH does not explicitly close
> resultsets and maybe if your JDBC driver also manages this poorly you could
> end up with strange conditions like the one you're getting?  It could be a
> case where your data has grown just over the limit your setup can handle
> under such an unfortunate circumstance.
>
> Let me know if this solves it.  If so, we probably should open a bug
> report and get this fixed in DIH.
>
> James Dyer
> E-Commerce Systems
> Ingram Content Group
> (615) 213-4311
>
>
> -Original Message-
> From: Jon Drukman [mailto:jdruk...@gmail.com]
> Sent: Tuesday, May 15, 2012 4:12 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Exception in DataImportHandler (stack overflow)
>
> i don't think so, my config is straightforward:
>
> 
>   url="jdbc:mysql://x/xx"
> user="x" password="x" batchSize="-1" />
>  
>   query="select content_id, description, title, add_date from
> content_solr where active = '1'">
> query="select tag_id from tags_assoc where content_id =
> '${content.content_id}'" />
> query="select count(1) as likes from votes where content_id =
> '${content.content_id}'" />
> query="select sum(views) as views from media_views mv join
> content_media cm USING (media_id) WHERE cm.content_id =
> '${content.content_id}'" />
>
>  
> 
>
> i'm triggering the import with:
>
> http://localhost:8983/solr/dataimport?command=full-import&clean=true&commit=true
>
>
>
> On Tue, May 15, 2012 at 2:07 PM, Michael Della Bitta <
> michael.della.bi...@appinions.com> wrote:
>
> > Hi, Jon:
> >
> > Well, you don't see that every day!
> >
> > Is it possible that you have something weird going on in your DDL
> > and/or queries, like a tree schema that now suddenly has a cyclical
> > reference?
> >
> > Michael
> >
> > On Tue, May 15, 2012 at 4:33 PM, Jon Drukman  wrote:
> > > I have a machine which does a full update using DataImportHandler every
> > > hour.  It worked up until a little while ago.  I did not change the
> > > dataconfig.xml or version of Solr.
> > >
> > > Here is the beginning of the error in the log (the real thing runs for
> > > thousands of lines)
> > >
> > > 2012-05-15 12:44:30.724166500 SEVERE: Full Import
> > > failed:org.apache.solr.handler.dataimport.DataImportHandlerException:
> > > java.lang.StackOverflowError
> > > 2012-05-15 12:44:30.724168500 at
> > >
> >
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:669)
> > > 2012-05-15 12:44:30.724169500 at
> > >
> >
> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:268)
> > > 2012-05-15 12:44:30.724171500 at
> > >
> >
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:187)
> > > 2012-05-15 12:44:30.724219500 at
> > >
> >
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359)
> > > 2012-05-15 12:44:30.724221500 at
> > >
> >
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427)
> > > 2012-05-15 12:44:30.724223500 at
> > >
> >
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
> > > 2012-05-15 12:44:30.724224500 Caused by: java.lang.StackOverflowError
> > > 2012-05-15 12:44:30.724225500 at
> > > java.lang.String.checkBounds(String.java:404)
> > > 2012-05-15 12:44:30.724234500 at
> java.lang.String.(String.java:450)
> > > 2012-05-15 12:44:30.724235500 at
> java.lang.String.(String.java:523)
> > > 2012-05-15 12:44:30.724236500 at
> > > java.net.SocketOutputStream.socketWrite0(Native Method)
> > > 2012-05-15 12:44:30.724238500 at
> > > java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
> > > 2012-05-15 12:44:30.724239500 at
> > > java.net.SocketOutputStream.write(SocketOutputStream.java:153)
> > > 2012-05-15 12:44:30.724253500 at
> > > java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> > > 2012-05-15 12:44:30.724254500 at
> > > java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> > > 2012-05-15 12:44:30.724256500 at
> > > com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3345)
> > > 2012-05-15 12:44:30.724257500 at
> > > com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1983)
> > > 2012-05-15 12:44:30.724259500 at
> > > com

Re: Boosting on field empty or not

2012-05-15 Thread Ahmet Arslan

> If the bq is only supposed apply the
> boost when the field value is greater
> than 0.01 why would trying another query make sure this is
> working.
> 
> Its applying the boost to all the fields, yes when the boost
> is high enough
> most of documents with a value GT 0.01 show up first however
> since it is
> applying the boost to all the documents sometimes documents
> without a value
> in this field appear before those that do.

If boosting is applied to all documents, then why result order is changing?

Sometimes documents without a value can show-up before because there are other 
factors that contribute score calculation. 

http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/search/Similarity.html

If you add &debugQuery=on, you can see detailed explanation about how 
calculation is done.
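
A minimal SolrJ sketch of requesting that explanation alongside the boost query
(the URL, query text and dismax handler are assumptions; the bq is the one from
this thread):

import java.util.Map;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class BqDebugSketch {
    public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
        SolrQuery q = new SolrQuery("some search terms");
        q.set("defType", "dismax");
        q.set("bq", "regularprice:[0.01 TO *]^50");
        q.set("debugQuery", "on");        // ask for per-document score explanations
        q.setFields("code", "score");
        QueryResponse rsp = solr.query(q);
        Map<String, Object> debug = rsp.getDebugMap();  // includes the "explain" section
        System.out.println(debug.get("explain"));
    }
}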


RE: Exception in DataImportHandler (stack overflow)

2012-05-15 Thread Dyer, James
Shot in the dark here, but try adding readOnly="true" to your dataSource tag.



This sets autocommit to true and sets the Holdability to 
ResultSet.CLOSE_CURSORS_AT_COMMIT.  DIH does not explicitly close resultsets 
and maybe if your JDBC driver also manages this poorly you could end up with 
strange conditions like the one you're getting?  It could be a case where your 
data has grown just over the limit your setup can handle under such an 
unfortunate circumstance.

Let me know if this solves it.  If so, we probably should open a bug report and 
get this fixed in DIH.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: Jon Drukman [mailto:jdruk...@gmail.com] 
Sent: Tuesday, May 15, 2012 4:12 PM
To: solr-user@lucene.apache.org
Subject: Re: Exception in DataImportHandler (stack overflow)

i don't think so, my config is straightforward:


  
  

   
   
   

  


i'm triggering the import with:
http://localhost:8983/solr/dataimport?command=full-import&clean=true&commit=true



On Tue, May 15, 2012 at 2:07 PM, Michael Della Bitta <
michael.della.bi...@appinions.com> wrote:

> Hi, Jon:
>
> Well, you don't see that every day!
>
> Is it possible that you have something weird going on in your DDL
> and/or queries, like a tree schema that now suddenly has a cyclical
> reference?
>
> Michael
>
> On Tue, May 15, 2012 at 4:33 PM, Jon Drukman  wrote:
> > I have a machine which does a full update using DataImportHandler every
> > hour.  It worked up until a little while ago.  I did not change the
> > dataconfig.xml or version of Solr.
> >
> > Here is the beginning of the error in the log (the real thing runs for
> > thousands of lines)
> >
> > 2012-05-15 12:44:30.724166500 SEVERE: Full Import
> > failed:org.apache.solr.handler.dataimport.DataImportHandlerException:
> > java.lang.StackOverflowError
> > 2012-05-15 12:44:30.724168500 at
> >
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:669)
> > 2012-05-15 12:44:30.724169500 at
> >
> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:268)
> > 2012-05-15 12:44:30.724171500 at
> >
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:187)
> > 2012-05-15 12:44:30.724219500 at
> >
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359)
> > 2012-05-15 12:44:30.724221500 at
> >
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427)
> > 2012-05-15 12:44:30.724223500 at
> >
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
> > 2012-05-15 12:44:30.724224500 Caused by: java.lang.StackOverflowError
> > 2012-05-15 12:44:30.724225500 at
> > java.lang.String.checkBounds(String.java:404)
> > 2012-05-15 12:44:30.724234500 at java.lang.String.(String.java:450)
> > 2012-05-15 12:44:30.724235500 at java.lang.String.(String.java:523)
> > 2012-05-15 12:44:30.724236500 at
> > java.net.SocketOutputStream.socketWrite0(Native Method)
> > 2012-05-15 12:44:30.724238500 at
> > java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
> > 2012-05-15 12:44:30.724239500 at
> > java.net.SocketOutputStream.write(SocketOutputStream.java:153)
> > 2012-05-15 12:44:30.724253500 at
> > java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> > 2012-05-15 12:44:30.724254500 at
> > java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> > 2012-05-15 12:44:30.724256500 at
> > com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3345)
> > 2012-05-15 12:44:30.724257500 at
> > com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1983)
> > 2012-05-15 12:44:30.724259500 at
> > com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2163)
> > 2012-05-15 12:44:30.724267500 at
> > com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2618)
> > 2012-05-15 12:44:30.724268500 at
> >
> com.mysql.jdbc.StatementImpl.executeSimpleNonQuery(StatementImpl.java:1644)
> > 2012-05-15 12:44:30.724270500 at
> > com.mysql.jdbc.RowDataDynamic.close(RowDataDynamic.java:198)
> > 2012-05-15 12:44:30.724271500 at
> > com.mysql.jdbc.ResultSetImpl.realClose(ResultSetImpl.java:7617)
> > 2012-05-15 12:44:30.724273500 at
> > com.mysql.jdbc.ResultSetImpl.close(ResultSetImpl.java:907)
> > 2012-05-15 12:44:30.724280500 at
> > com.mysql.jdbc.StatementImpl.realClose(StatementImpl.java:2478)
> > 2012-05-15 12:44:30.724282500 at
> >
> com.mysql.jdbc.ConnectionImpl.closeAllOpenStatements(ConnectionImpl.java:1584)
> > 2012-05-15 12:44:30.724283500 at
> > com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4364)
> > 2012-05-15 12:44:30.724285500 at
> > com.mysql.jdbc.ConnectionImpl.cleanup(ConnectionImpl.java:1360)
> > 2012-05-15 12:44:30.724286500 at
> > com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2652)
> > 2012-05-15 12:44:30.724321500 at
> >
> com.mysql.jdbc.StatementImpl.executeSimpleNonQuery(StatementImpl.java:1644)
> > 2012-05-15 12:44:30.724322500 at
> > com.mysql.jdbc.RowDataDynamic

- When is Solr 4.0 due for Release? ...

2012-05-15 Thread Naga Vijayapuram
… Any idea, anyone?

Thanks
Naga


Re: - Solr 4.0 - How do I enable JSP support ? ...

2012-05-15 Thread Naga Vijayapuram
Finally got a handle on this by looking into the New Admin UI -
http://localhost:8983/solr/#/~cloud

Thanks
Naga


On 5/15/12 12:53 PM, "Naga Vijayapuram"  wrote:

>Alright; thanks.  Tried with "-OPTIONS=jsp" and am still seeing this on
>console …
>
>2012-05-15 12:47:08.837:INFO:solr:No JSP support.  Check that JSP jars are
>in lib/jsp and that the JSP option has been specified to start.jar
>
>I am trying to go after
>http://localhost:8983/solr/collection1/admin/zookeeper.jsp (or its
>equivalent in 4.0) after going through
>http://wiki.apache.org/solr/SolrCloud
>
>May I know the right zookeeper url in 4.0 please?
>
>Thanks
>Naga
>
>
>On 5/15/12 10:56 AM, "Ryan McKinley"  wrote:
>
>>In 4.0, solr no longer uses JSP, so it is not enabled in the example
>>setup.
>>
>>You can enable JSP in your servlet container using whatever method
>>they provide.  For Jetty, using start.jar, you need to add the command
>>line: java -jar start.jar -OPTIONS=jsp
>>
>>ryan
>>
>>
>>
>>On Mon, May 14, 2012 at 2:34 PM, Naga Vijayapuram 
>>wrote:
>>> Hello,
>>>
>>> How do I enable JSP support in Solr 4.0 ?
>>>
>>> Thanks
>>> Naga
>



Re: Exception in DataImportHandler (stack overflow)

2012-05-15 Thread Jon Drukman
i don't think so, my config is straightforward:


  
  

   
   
   

  


i'm triggering the import with:
http://localhost:8983/solr/dataimport?command=full-import&clean=true&commit=true



On Tue, May 15, 2012 at 2:07 PM, Michael Della Bitta <
michael.della.bi...@appinions.com> wrote:

> Hi, Jon:
>
> Well, you don't see that every day!
>
> Is it possible that you have something weird going on in your DDL
> and/or queries, like a tree schema that now suddenly has a cyclical
> reference?
>
> Michael
>
> On Tue, May 15, 2012 at 4:33 PM, Jon Drukman  wrote:
> > I have a machine which does a full update using DataImportHandler every
> > hour.  It worked up until a little while ago.  I did not change the
> > dataconfig.xml or version of Solr.
> >
> > Here is the beginning of the error in the log (the real thing runs for
> > thousands of lines)
> >
> > 2012-05-15 12:44:30.724166500 SEVERE: Full Import
> > failed:org.apache.solr.handler.dataimport.DataImportHandlerException:
> > java.lang.StackOverflowError
> > 2012-05-15 12:44:30.724168500 at
> >
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:669)
> > 2012-05-15 12:44:30.724169500 at
> >
> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:268)
> > 2012-05-15 12:44:30.724171500 at
> >
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:187)
> > 2012-05-15 12:44:30.724219500 at
> >
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359)
> > 2012-05-15 12:44:30.724221500 at
> >
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427)
> > 2012-05-15 12:44:30.724223500 at
> >
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
> > 2012-05-15 12:44:30.724224500 Caused by: java.lang.StackOverflowError
> > 2012-05-15 12:44:30.724225500 at
> > java.lang.String.checkBounds(String.java:404)
> > 2012-05-15 12:44:30.724234500 at java.lang.String.(String.java:450)
> > 2012-05-15 12:44:30.724235500 at java.lang.String.(String.java:523)
> > 2012-05-15 12:44:30.724236500 at
> > java.net.SocketOutputStream.socketWrite0(Native Method)
> > 2012-05-15 12:44:30.724238500 at
> > java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
> > 2012-05-15 12:44:30.724239500 at
> > java.net.SocketOutputStream.write(SocketOutputStream.java:153)
> > 2012-05-15 12:44:30.724253500 at
> > java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> > 2012-05-15 12:44:30.724254500 at
> > java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> > 2012-05-15 12:44:30.724256500 at
> > com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3345)
> > 2012-05-15 12:44:30.724257500 at
> > com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1983)
> > 2012-05-15 12:44:30.724259500 at
> > com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2163)
> > 2012-05-15 12:44:30.724267500 at
> > com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2618)
> > 2012-05-15 12:44:30.724268500 at
> >
> com.mysql.jdbc.StatementImpl.executeSimpleNonQuery(StatementImpl.java:1644)
> > 2012-05-15 12:44:30.724270500 at
> > com.mysql.jdbc.RowDataDynamic.close(RowDataDynamic.java:198)
> > 2012-05-15 12:44:30.724271500 at
> > com.mysql.jdbc.ResultSetImpl.realClose(ResultSetImpl.java:7617)
> > 2012-05-15 12:44:30.724273500 at
> > com.mysql.jdbc.ResultSetImpl.close(ResultSetImpl.java:907)
> > 2012-05-15 12:44:30.724280500 at
> > com.mysql.jdbc.StatementImpl.realClose(StatementImpl.java:2478)
> > 2012-05-15 12:44:30.724282500 at
> >
> com.mysql.jdbc.ConnectionImpl.closeAllOpenStatements(ConnectionImpl.java:1584)
> > 2012-05-15 12:44:30.724283500 at
> > com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4364)
> > 2012-05-15 12:44:30.724285500 at
> > com.mysql.jdbc.ConnectionImpl.cleanup(ConnectionImpl.java:1360)
> > 2012-05-15 12:44:30.724286500 at
> > com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2652)
> > 2012-05-15 12:44:30.724321500 at
> >
> com.mysql.jdbc.StatementImpl.executeSimpleNonQuery(StatementImpl.java:1644)
> > 2012-05-15 12:44:30.724322500 at
> > com.mysql.jdbc.RowDataDynamic.close(RowDataDynamic.java:198)
> > 2012-05-15 12:44:30.724324500 at
> > com.mysql.jdbc.ResultSetImpl.realClose(ResultSetImpl.java:7617)
> > 2012-05-15 12:44:30.724325500 at
> > com.mysql.jdbc.ResultSetImpl.close(ResultSetImpl.java:907)
> > 2012-05-15 12:44:30.724327500 at
> > com.mysql.jdbc.StatementImpl.realClose(StatementImpl.java:2478)
> > 2012-05-15 12:44:30.724334500 at
> >
> com.mysql.jdbc.ConnectionImpl.closeAllOpenStatements(ConnectionImpl.java:1584)
> > 2012-05-15 12:44:30.724335500 at
> > com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4364)
> > 2012-05-15 12:44:30.724336500 at
> > com.mysql.jdbc.ConnectionImpl.cleanup(ConnectionImpl.java:1360)
> > 2012-05-15 12:44:30.724338500 at
> > com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2652)
> > 2012-05-15 12:44:30.724339500 at
> >
> com.mysql.jdbc.StatementImp

Re: doing a full-import after deleting records in the database - maxDocs

2012-05-15 Thread Michael Della Bitta
Hello, geeky2:

In statistics in the update section, do you see a non-zero value for
docsPending?

Thanks,

Michael

On Tue, May 15, 2012 at 4:49 PM, geeky2  wrote:
>
> hello,
>
> After doing a DIH full-import (with clean=true) after deleting records in
> the database, i noticed that the number of documents processed, did change.
>
>
> example:
>
> Indexing completed. Added/Updated: 595908 documents. Deleted 0 documents.
>
> however, i noticed the numbers on the statistics page did not change nor do
> they match the number of indexed records -
>
>
> can someone help me understand the difference in these numbers and the
> meaning of maxDoc / numDoc?
>
> numDocs : 594893
> maxDoc : 594893
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/doing-a-full-import-after-deleting-records-in-the-database-maxDocs-tp3983948.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Exception in DataImportHandler (stack overflow)

2012-05-15 Thread Michael Della Bitta
Hi, Jon:

Well, you don't see that every day!

Is it possible that you have something weird going on in your DDL
and/or queries, like a tree schema that now suddenly has a cyclical
reference?

Michael

On Tue, May 15, 2012 at 4:33 PM, Jon Drukman  wrote:
> I have a machine which does a full update using DataImportHandler every
> hour.  It worked up until a little while ago.  I did not change the
> dataconfig.xml or version of Solr.
>
> Here is the beginning of the error in the log (the real thing runs for
> thousands of lines)
>
> 2012-05-15 12:44:30.724166500 SEVERE: Full Import
> failed:org.apache.solr.handler.dataimport.DataImportHandlerException:
> java.lang.StackOverflowError
> 2012-05-15 12:44:30.724168500 at
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:669)
> 2012-05-15 12:44:30.724169500 at
> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:268)
> 2012-05-15 12:44:30.724171500 at
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:187)
> 2012-05-15 12:44:30.724219500 at
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359)
> 2012-05-15 12:44:30.724221500 at
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427)
> 2012-05-15 12:44:30.724223500 at
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
> 2012-05-15 12:44:30.724224500 Caused by: java.lang.StackOverflowError
> 2012-05-15 12:44:30.724225500 at
> java.lang.String.checkBounds(String.java:404)
> 2012-05-15 12:44:30.724234500 at java.lang.String.(String.java:450)
> 2012-05-15 12:44:30.724235500 at java.lang.String.(String.java:523)
> 2012-05-15 12:44:30.724236500 at
> java.net.SocketOutputStream.socketWrite0(Native Method)
> 2012-05-15 12:44:30.724238500 at
> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
> 2012-05-15 12:44:30.724239500 at
> java.net.SocketOutputStream.write(SocketOutputStream.java:153)
> 2012-05-15 12:44:30.724253500 at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> 2012-05-15 12:44:30.724254500 at
> java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> 2012-05-15 12:44:30.724256500 at
> com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3345)
> 2012-05-15 12:44:30.724257500 at
> com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1983)
> 2012-05-15 12:44:30.724259500 at
> com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2163)
> 2012-05-15 12:44:30.724267500 at
> com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2618)
> 2012-05-15 12:44:30.724268500 at
> com.mysql.jdbc.StatementImpl.executeSimpleNonQuery(StatementImpl.java:1644)
> 2012-05-15 12:44:30.724270500 at
> com.mysql.jdbc.RowDataDynamic.close(RowDataDynamic.java:198)
> 2012-05-15 12:44:30.724271500 at
> com.mysql.jdbc.ResultSetImpl.realClose(ResultSetImpl.java:7617)
> 2012-05-15 12:44:30.724273500 at
> com.mysql.jdbc.ResultSetImpl.close(ResultSetImpl.java:907)
> 2012-05-15 12:44:30.724280500 at
> com.mysql.jdbc.StatementImpl.realClose(StatementImpl.java:2478)
> 2012-05-15 12:44:30.724282500 at
> com.mysql.jdbc.ConnectionImpl.closeAllOpenStatements(ConnectionImpl.java:1584)
> 2012-05-15 12:44:30.724283500 at
> com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4364)
> 2012-05-15 12:44:30.724285500 at
> com.mysql.jdbc.ConnectionImpl.cleanup(ConnectionImpl.java:1360)
> 2012-05-15 12:44:30.724286500 at
> com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2652)
> 2012-05-15 12:44:30.724321500 at
> com.mysql.jdbc.StatementImpl.executeSimpleNonQuery(StatementImpl.java:1644)
> 2012-05-15 12:44:30.724322500 at
> com.mysql.jdbc.RowDataDynamic.close(RowDataDynamic.java:198)
> 2012-05-15 12:44:30.724324500 at
> com.mysql.jdbc.ResultSetImpl.realClose(ResultSetImpl.java:7617)
> 2012-05-15 12:44:30.724325500 at
> com.mysql.jdbc.ResultSetImpl.close(ResultSetImpl.java:907)
> 2012-05-15 12:44:30.724327500 at
> com.mysql.jdbc.StatementImpl.realClose(StatementImpl.java:2478)
> 2012-05-15 12:44:30.724334500 at
> com.mysql.jdbc.ConnectionImpl.closeAllOpenStatements(ConnectionImpl.java:1584)
> 2012-05-15 12:44:30.724335500 at
> com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4364)
> 2012-05-15 12:44:30.724336500 at
> com.mysql.jdbc.ConnectionImpl.cleanup(ConnectionImpl.java:1360)
> 2012-05-15 12:44:30.724338500 at
> com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2652)
> 2012-05-15 12:44:30.724339500 at
> com.mysql.jdbc.StatementImpl.executeSimpleNonQuery(StatementImpl.java:1644)
> 2012-05-15 12:44:30.724345500 at
> com.mysql.jdbc.RowDataDynamic.close(RowDataDynamic.java:198)
> 2012-05-15 12:44:30.724347500 at
> com.mysql.jdbc.ResultSetImpl.realClose(ResultSetImpl.java:7617)
> 2012-05-15 12:44:30.724348500 at
> com.mysql.jdbc.ResultSetImpl.close(ResultSetImpl.java:907)
> 2012-05-15 12:44:30.724350500 at
> com.mysql.jdbc.StatementImpl.realClose(StatementImpl.java:2478)
> 2012-05-15 12:44:30.724351500 at
> com.mysql.jdbc.ConnectionImpl.clo

Re: Boosting on field empty or not

2012-05-15 Thread Donald Organ
If the bq is only supposed to apply the boost when the field value is greater
than 0.01, why would trying another query confirm this is working?

It's applying the boost to all the documents: yes, when the boost is high enough
most documents with a value GT 0.01 show up first, but since it is applying the
boost to all the documents, documents without a value in this field sometimes
appear before those that do.



On Tue, May 15, 2012 at 4:51 PM, Ahmet Arslan  wrote:

> > Scratch that...it still seems to be
> > boosting documents where the value of
> > the field is empty.
> >
> >
> > bq=regularprice:[0.01 TO *]^50
> >
> > Results with bq set:
> >
> > 
> >  > name="score">2.2172112
> >  > name="code">bhl-ltab-30
> >   
> >
> >
> > Results without bq set:
> >
> > 
> >  > name="score">2.4847748
> >  > name="code">bhl-ltab-30
> >   
> >
>
> Important thing is the order. Does the order of results change in a way
> that you want? (When you add bq)
>
> It is not a good idea to compare scores of two different queries. I
> *think* queryNorm is causing this difference.
> You can add debugQuery=on and see what is the difference.
>
>


Re: Boosting on field empty or not

2012-05-15 Thread Ahmet Arslan
> Scratch that...it still seems to be
> boosting documents where the value of
> the field is empty.
> 
> 
> bq=regularprice:[0.01 TO *]^50
> 
> Results with bq set:
> 
> 
>      name="score">2.2172112
>      name="code">bhl-ltab-30
>   
> 
> 
> Results without bq set:
> 
> 
>      name="score">2.4847748
>      name="code">bhl-ltab-30
>   
> 

Important thing is the order. Does the order of results change in a way that 
you want? (When you add bq) 

It is not a good idea to compare scores of two different queries. I *think* 
queryNorm is causing this difference.
You can add debugQuery=on and see what is the difference.



doing a full-import after deleting records in the database - maxDocs

2012-05-15 Thread geeky2

hello,

After doing a DIH full-import (with clean=true) after deleting records in
the database, I noticed that the number of documents processed did change.


example:

Indexing completed. Added/Updated: 595908 documents. Deleted 0 documents.

However, I noticed the numbers on the statistics page did not change, nor do
they match the number of indexed records.


Can someone help me understand the difference in these numbers and the
meaning of maxDoc / numDocs?

numDocs : 594893
maxDoc : 594893 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/doing-a-full-import-after-deleting-records-in-the-database-maxDocs-tp3983948.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Replacing payloads for per-document-per-keyword scores

2012-05-15 Thread Mikhail Khludnev
Hello Neil,

if "manipulating tf" is a possible approach, why don't extend
KeywordTokenizer to make it work in the following manner:

"3|wheel" -> {wheel,wheel,wheel}

it will allow supply your per-term-per-doc boosts as a prefixes for field
values and multiply them during indexing internally.
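
A rough sketch of that idea, written as a TokenFilter rather than a
KeywordTokenizer subclass (Lucene 3.x attribute API assumed; the class name and
details are illustrative, not tested code): it strips a numeric "count|"
prefix and re-emits the term that many times at the same position, so tf
carries the supplied per-term-per-doc boost.

import java.io.IOException;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;

public final class RepeatByPrefixFilter extends TokenFilter {
    private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
    private final PositionIncrementAttribute posIncAtt = addAttribute(PositionIncrementAttribute.class);
    private String pending;   // term still being repeated
    private int remaining;    // repeats left to emit

    public RepeatByPrefixFilter(TokenStream input) {
        super(input);
    }

    @Override
    public boolean incrementToken() throws IOException {
        if (remaining > 0) {
            // emit another copy of the same term at the same position
            termAtt.setEmpty().append(pending);
            posIncAtt.setPositionIncrement(0);
            remaining--;
            return true;
        }
        if (!input.incrementToken()) {
            return false;
        }
        String term = termAtt.toString();
        int bar = term.indexOf('|');
        int count = 1;
        if (bar > 0) {
            try {
                count = Integer.parseInt(term.substring(0, bar));  // "3|wheel" -> 3
                term = term.substring(bar + 1);                    // "3|wheel" -> "wheel"
                termAtt.setEmpty().append(term);
            } catch (NumberFormatException ignored) {
                // no numeric prefix: pass the token through unchanged
            }
        }
        pending = term;
        remaining = count - 1;
        return true;
    }

    @Override
    public void reset() throws IOException {
        super.reset();
        pending = null;
        remaining = 0;
    }
}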

The second consideration is - have you considered Click Scoring Tools from
lucidworks as a relevant approach?

Regards

On Wed, May 16, 2012 at 12:02 AM, Neil Hooey  wrote:

> Hello Hoss and the list,
>
> We are currently using Lucene payloads to store per-document-per-keyword
> scores for our dataset. Our dataset consists of photos with keywords
> assigned (only once each) to them. The index is about 90 GB, running on
> 24-core machines with dedicated 10k SAS drives, and 16/32 GB allocated to
> the JVM.
>
> When searching the payloads field, our 98 percentile query time is at 2
> seconds even with trivially low queries per second. I have asked several
> Lucene committers about this and it's believed that the implementation of
> payloads being so general is the cause of the slowness.
>
> Hoss guessed that we could override Term Frequency with PreAnalyzedField[1]
> for the per-keyword scores, since keywords (tags) always have a Term
> Frequency of 1 and the TF calculation is very fast. However it turns out
> that you can't[2] specify TF in the PreAnalyzedField.
>
> Is there any other way to override Term Frequency during index time? If
> not, where in the code could this be implemented?
>
> An obvious option is to repeat the keyword as many times as its payload
> score, but that would drastically increase the amount of data per document
> sent during index time.
>
> I'd welcome any other per-document-per-keyword score solutions, or some way
> to speed up searching a payload field.
>
> Thanks,
>
> - Neil
>
> [1] https://issues.apache.org/jira/browse/SOLR-1535
> [2]
>
> https://issues.apache.org/jira/browse/SOLR-1535?focusedCommentId=13273501#comment-13273501
>



-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics


 


Re: Show a portion of searchable text in Solr

2012-05-15 Thread anarchos78
Thanks

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Show-a-portion-of-searchable-text-in-Solr-tp3983613p3983942.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: apostrophe / ayn / alif

2012-05-15 Thread Robert Muir
On Tue, May 15, 2012 at 2:47 PM, Naomi Dushay  wrote:
> We are using the ICUFoldingFilterFactory with great success to fold 
> diacritics so searches with and without the diacritics get the same results.
>
> We recently discovered we have some Korean records that use an alif diacritic 
> instead of an apostrophe, and this diacritic is NOT getting folded.   Has 
> anyone experienced this for alif or ayn characters?   Do you have a solution?
>

What do you mean alif diacritic in Korean? Alif (ا) isn't a diacritic
and isn't used in Korean.

Or did you mean arabic dagger alif ( ٰ ) ? This is not a diacritic in
Unicode (though it's a combining mark).


-- 
lucidimagination.com


Replacing payloads for per-document-per-keyword scores

2012-05-15 Thread Neil Hooey
Hello Hoss and the list,

We are currently using Lucene payloads to store per-document-per-keyword
scores for our dataset. Our dataset consists of photos with keywords
assigned (only once each) to them. The index is about 90 GB, running on
24-core machines with dedicated 10k SAS drives, and 16/32 GB allocated to
the JVM.

When searching the payloads field, our 98th percentile query time is at 2
seconds even with trivially low queries per second. I have asked several
Lucene committers about this and it's believed that the implementation of
payloads being so general is the cause of the slowness.

Hoss guessed that we could override Term Frequency with PreAnalyzedField[1]
for the per-keyword scores, since keywords (tags) always have a Term
Frequency of 1 and the TF calculation is very fast. However it turns out
that you can't[2] specify TF in the PreAnalyzedField.

Is there any other way to override Term Frequency during index time? If
not, where in the code could this be implemented?

An obvious option is to repeat the keyword as many times as its payload
score, but that would drastically increase the amount of data per document
sent during index time.

I'd welcome any other per-document-per-keyword score solutions, or some way
to speed up searching a payload field.

Thanks,

- Neil

[1] https://issues.apache.org/jira/browse/SOLR-1535
[2]
https://issues.apache.org/jira/browse/SOLR-1535?focusedCommentId=13273501#comment-13273501


Re: - Solr 4.0 - How do I enable JSP support ? ...

2012-05-15 Thread Naga Vijayapuram
Alright; thanks.  Tried with "-OPTIONS=jsp" and am still seeing this on
console …

2012-05-15 12:47:08.837:INFO:solr:No JSP support.  Check that JSP jars are
in lib/jsp and that the JSP option has been specified to start.jar

I am trying to go after
http://localhost:8983/solr/collection1/admin/zookeeper.jsp (or its
equivalent in 4.0) after going through
http://wiki.apache.org/solr/SolrCloud

May I know the right zookeeper url in 4.0 please?

Thanks
Naga


On 5/15/12 10:56 AM, "Ryan McKinley"  wrote:

>In 4.0, solr no longer uses JSP, so it is not enabled in the example
>setup.
>
>You can enable JSP in your servlet container using whatever method
>they provide.  For Jetty, using start.jar, you need to add the command
>line: java -jar start.jar -OPTIONS=jsp
>
>ryan
>
>
>
>On Mon, May 14, 2012 at 2:34 PM, Naga Vijayapuram 
>wrote:
>> Hello,
>>
>> How do I enable JSP support in Solr 4.0 ?
>>
>> Thanks
>> Naga



Re: Boosting on field empty or not

2012-05-15 Thread Donald Organ
Scratch that...it still seems to be boosting documents where the value of
the field is empty.


bq=regularprice:[0.01 TO *]^50

Results with bq set:


2.2172112
bhl-ltab-30
  


Results without bq set:


2.4847748
bhl-ltab-30
  


On Tue, May 15, 2012 at 12:40 PM, Donald Organ wrote:

> I have figured it out using your recommendation...I just had to give it a
> high enough boost.
>
> BTW its a float field
>
> On Tue, May 15, 2012 at 9:21 AM, Ahmet Arslan  wrote:
>
>> > The problem with what you provided is
>> > it is boosting ALL documents whether
>> > the field is empty or not
>>
>> Then all of your fields are non-empty? What is the type of your field?
>>
>>
>


Solr Caches

2012-05-15 Thread Rahul R
Hello,
I am trying to understand how I can size the caches for my solr powered
application. Some details on the index and application :
Solr Version : 1.3
JDK : 1.5.0_14 32 bit
OS : Solaris 10
App Server : Weblogic 10 MP1
Number of documents : 1 million
Total number of fields : 1000 (750 strings, 225 int/float/double/long, 25
boolean)
Number of fields on which faceting and filtering can be done : 400
Physical size of  index : 600MB
Number of unique values for a field : Ranges from 5 - 1000. Average of 150
-Xms and -Xmx vals for jvm : 3G
Expected number of concurrent users : 15
No sorting planned for now

Now I want to set appropriate values for the caches. I have put below some
of my understanding and questions about the caches. Please correct and
answer accordingly.
FilterCache:
As per the solr wiki, this is used to store an unordered list of Ids of
matching documents for an fq param.
So if a query contains two fq params, it will create two separate entries
for each of these fq params. The value of each entry is the list of ids of
all documents across the index that match the corresponding fq param. Each
entry is independent of any other entry.
A minimum size for filterCache could be (total number of fields * avg
number of unique values per field) ? Is this correct ? I have not enabled
.
Max physical size of the filter cache would be (size * avg byte size of a
document id * avg number of docs returned per fq param) ?

QueryResultsCache:
Used to store an ordered list of ids of the documents that match the most
commonly used searches. So if my query is something like
q=Status:Active&fq=Org:Apache&fq=Version:13, it will create one entry that
contains list of ids of documents that match this full query. Is this
correct ? How can I size my queryResultsCache ? Some entries from
solrconfig.xml :
50
200
Max physical size of the queryResultsCache would be (size * avg byte size of a
document id * avg number of docs per query). Is this correct ?


documentCache:
Stores the documents that are stored in the index. Say I do two searches
that return three documents each, with 1 document being common between both
result sets. Will this result in 5 entries in the documentCache for the 5
unique documents that have been returned for the two queries ? Is this
correct ? For sizing, SolrWiki states that "*The size for the documentCache
should always be greater than <max_results> * <max_concurrent_queries>*".
Why do we need the max_concurrent_queries parameter here ? Is it when
max_results is much less than numDocs ? In my case, a q=*:* search is done
the first time the index is loaded. So, will setting documentCache size to
numDocs be correct ? Can this be like the max that I need to allocate ?
Max physical size of document cache would be (size * avg byte size of a
document in the index). Is this correct ?

Thank you

-Rahul
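
As a rough yardstick for the filterCache question above (an assumption about
how Solr stores filters, not a quote from the wiki): each cached filter is a
DocSet, held either as a small list of doc ids or, for large result sets, as a
bitset of maxDoc bits, so a worst-case bound is easy to compute:

public class FilterCacheEstimate {
    public static void main(String[] args) {
        long maxDoc = 1000000L;   // index size from the message above
        int entries = 512;        // hypothetical filterCache size setting
        long bytesPerBitset = maxDoc / 8;               // worst-case entry (bitset of maxDoc bits)
        long worstCaseBytes = entries * bytesPerBitset;
        System.out.println(bytesPerBitset / 1024 + " KB per bitset entry");                     // ~122 KB
        System.out.println(worstCaseBytes / (1024 * 1024) + " MB for " + entries + " entries"); // ~61 MB
    }
}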


Re: Invalid version (expected 2, but 60) on CentOS in production please Help!!!

2012-05-15 Thread Ravi Solr
I have already triple cross-checked  that all my clients are using
same version as the server which is 3.6

Thanks

Ravi Kiran

On Tue, May 15, 2012 at 2:09 PM, Ramesh K Balasubramanian
 wrote:
> I have seen similar errors before when the solr version and solrj version in 
> the client don't match.
>
> Best Regards,
> Ramesh
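
One observation that may help narrow this down: the javabin unmarshaller reads
the first byte of the response as a version number and expects 2, and 60 is
the ASCII code for '<'. That usually means the shard returned XML or an HTML
error page (for example from a proxy or an error response) rather than a
javabin payload, so the body of that one failing shard response is worth
capturing. A trivial check of the byte value:

public class VersionByte {
    public static void main(String[] args) {
        // "expected 2, but 60": byte 60 decodes to '<', the start of an XML/HTML body
        System.out.println((char) 60);   // prints: <
    }
}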


apostrophe / ayn / alif

2012-05-15 Thread Naomi Dushay
We are using the ICUFoldingFilterFactory with great success to fold diacritics 
so searches with and without the diacritics get the same results.

We recently discovered we have some Korean records that use an alif diacritic 
instead of an apostrophe, and this diacritic is NOT getting folded.   Has 
anyone experienced this for alif or ayn characters?   Do you have a solution?


- Naomi

Re: Invalid version (expected 2, but 60) on CentOS in production please Help!!!

2012-05-15 Thread Ramesh K Balasubramanian
I have seen similar errors before when the solr version and solrj version in 
the client don't match.
 
Best Regards,
Ramesh

Re: Urgent! Highlighting not working as expected

2012-05-15 Thread TJ Tong
Thanks, Jack! I think you are right. But I also copied cr_firstname to text,
so I assumed Solr would highlight cr_firstname if there is a match. I guess the
only solution is to copy all fields to another field which is not tokenized.
Yes, it is "firstname", good catch! 

Thanks again!

TJ

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Urgent-Highlighting-not-working-as-expected-tp3983755p3983907.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Invalid version (expected 2, but 60) on CentOS in production please Help!!!

2012-05-15 Thread Ravi Solr
Hello,
   Unfortunately it seems I spoke too soon. This morning I
received the same error again, even after disabling iptables. The
weird thing is that only one out of 6 or 7 queries fails, as evidenced in
the stack trace below. The query below the stack trace gave a
'status=500'; subsequent queries look fine.


[#|2012-05-15T08:12:38.703-0400|SEVERE|sun-appserver2.1.1|org.apache.solr.core.SolrCore|_ThreadID=32;_ThreadName=httpSSLWorkerThread-9001-8;_RequestID=9f54ea89-357a-4c1b-87a1-fbaacc9fd0ee;|org.apache.solr.common.SolrException
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:275)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:246)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:214)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:313)
at 
org.apache.catalina.core.StandardContextValve.invokeInternal(StandardContextValve.java:287)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:218)
at 
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:648)
at 
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:593)
at com.sun.enterprise.web.WebPipeline.invoke(WebPipeline.java:94)
at 
com.sun.enterprise.web.PESessionLockingStandardPipeline.invoke(PESessionLockingStandardPipeline.java:98)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:222)
at 
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:648)
at 
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:593)
at 
org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:587)
at 
org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:1093)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:166)
at 
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:648)
at 
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:593)
at 
org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:587)
at 
org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:1093)
at 
org.apache.coyote.tomcat5.CoyoteAdapter.service(CoyoteAdapter.java:291)
at 
com.sun.enterprise.web.connector.grizzly.DefaultProcessorTask.invokeAdapter(DefaultProcessorTask.java:670)
at 
com.sun.enterprise.web.connector.grizzly.DefaultProcessorTask.doProcess(DefaultProcessorTask.java:601)
at 
com.sun.enterprise.web.connector.grizzly.DefaultProcessorTask.process(DefaultProcessorTask.java:875)
at 
com.sun.enterprise.web.connector.grizzly.DefaultReadTask.executeProcessorTask(DefaultReadTask.java:365)
at 
com.sun.enterprise.web.connector.grizzly.DefaultReadTask.doTask(DefaultReadTask.java:285)
at 
com.sun.enterprise.web.connector.grizzly.DefaultReadTask.doTask(DefaultReadTask.java:221)
at 
com.sun.enterprise.web.connector.grizzly.TaskBase.run(TaskBase.java:269)
at 
com.sun.enterprise.web.connector.grizzly.ssl.SSLWorkerThread.run(SSLWorkerThread.java:111)
Caused by: java.lang.RuntimeException: Invalid version (expected 2,
but 60) or the data in not in 'javabin' format
at 
org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:99)
at 
org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:41)
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:469)
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:249)
at 
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:129)
at 
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:103)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   

Re: Highlight feature

2012-05-15 Thread Ramesh K Balasubramanian
That is the default response format. If you would like to change that, you
could extend the search handler or post-process the XML data. Another option
would be to use javabin (if your app is Java based) and build the XML the way
your app needs it.
 
Best Regards,
Ramesh


Re: - Solr 4.0 - How do I enable JSP support ? ...

2012-05-15 Thread Ryan McKinley
In 4.0, solr no longer uses JSP, so it is not enabled in the example setup.

You can enable JSP in your servlet container using whatever method
they provide.  For Jetty, using start.jar, you need to add the command
line: java -jar start.jar -OPTIONS=jsp

ryan



On Mon, May 14, 2012 at 2:34 PM, Naga Vijayapuram  wrote:
> Hello,
>
> How do I enable JSP support in Solr 4.0 ?
>
> Thanks
> Naga


Re: Boosting on field empty or not

2012-05-15 Thread Donald Organ
I have figured it out using your recommendation...I just had to give it a
high enough boost.

BTW its a float field

On Tue, May 15, 2012 at 9:21 AM, Ahmet Arslan  wrote:

> > The problem with what you provided is
> > it is boosting ALL documents whether
> > the field is empty or not
>
> Then all of your fields are non-empty? What is the type of your field?
>
>


Index an URL

2012-05-15 Thread Tolga

Hi,

I have a few questions, please bear with me:

1- I have a theory: Nutch may be used to index into Solr when we don't 
have access to the URL's file system, while we can use curl when we do have 
access. Am I correct?
2- A tutorial I have been reading is talking about different levels of 
id. Is there such a thing (exid6, exid7 etc)?
3- When I use curl 
"http://localhost:8983/solr/update/extract?literal.id=exid7&commit=true"; 
-F "myfile=@serialized-form.html", I get ERROR: [doc=exid7] unknown 
field 'ignored_link'. Is this something exid7 gives me? Where does 
this field ignored_link come from? Do I need to add all these fields to 
schema.xml in order not to get such an error? What is the safest way?


Regards,


Re: Urgent! Highlighting not working as expected

2012-05-15 Thread Jack Krupansky
In the case of text:"G-Money", the term is analyzed by Solr into the phrase 
"g money", which matches in the text field, but will not match for a string 
field containing the literal text "G-Money". But when you query 
cr_fristname:"G-Money", the term is not tokenized by the Solr analyzer 
because it is a value for a "string" field, and a literal match occurs in 
the string field "cr_fristname". I think that fully accounts for the 
behavior you see.


You might consider having a cr_fristname_text field which is tokenized text 
with a copyField from cr_fristname that fully supports highlighting of text 
terms.
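
In schema.xml that could look roughly like this (field and type names here are
only illustrative, adjust to your schema):

<field name="cr_firstname_text" type="text" indexed="true" stored="true"/>
<copyField source="cr_firstname" dest="cr_firstname_text"/>

Then query and highlight against cr_firstname_text rather than the string field.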


BTW, I presume that should be "first" name, not "frist" name.

-- Jack Krupansky

-Original Message- 
From: TJ Tong

Sent: Tuesday, May 15, 2012 11:15 AM
To: solr-user@lucene.apache.org
Subject: Re: Urgent! Highlighting not working as expected

Hi Jack,

Thanks for your reply. I did not specify dismax when query with highlighting
enabled: q=text:"G-Money"&hl=true&hl.fl=*, that was the whole query string I
sent. What puzzled me is that the "string" field "cr_firstname" was copied
to text, but it was not highlighted. But if I use
q=cr_fristname:"G-Money"&hl=true&hl.fl=*, it will be highlighted. I attached
my solrconfig.xml here, could you please take a look? Thanks again!
http://lucene.472066.n3.nabble.com/file/n3983883/solrconfig.xml
solrconfig.xml

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Urgent-Highlighting-not-working-as-expected-tp3983755p3983883.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Re: Highlight feature

2012-05-15 Thread TJ Tong
I am also working on highlighting. I don't think so. And the ids in the
highlighting part are the ids of the docs retrieved.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Highlight-feature-tp3983875p3983887.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: need help with getting exact matches to score higher

2012-05-15 Thread Tanguy Moal
Hello,
From the response you pasted here, it looks like the field
"itemNoExactMatchStr"
never matched.

Can you try matching in that field only and ensure you have matches ? Given
the ^30 boost, you should have high scores on this field...
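
For example, something along these lines (just a sketch) should show whether
the exact-match field contains what you expect:

q=itemNoExactMatchStr:9030&debugQuery=on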

Hope this helps,

--
Tanguy

2012/5/15 geeky2 

> Hello all,
>
>
> i am trying to tune our core for exact matches on a single field (itemNo)
> and having issues getting it to work.
>
> in addition - i need help understanding the output from debugQuery=on where
> it presents the scoring.
>
> my goal is to get exact matches to arrive at the top of the results.
> however - what i am seeing is non-exact matches arrive at the top of the
> results with MUCH higher scores.
>
>
>
> // from schema.xml - i am copying itemNo in to the string field for use in
> boosting
>
>   stored="false"/>
>  
>
> // from solrconfig.xml - i have the boost set for my special exact match
> field and the sorting on score desc.
>
>   class="solr.SearchHandler" default="false">
>
>  edismax
>  all
>  10
>  *itemNoExactMatchStr^30 itemNo^.9 divProductTypeDesc^.8
> brand^.5*
>  *:*
> * score desc*
>  true
>  itemDescFacet
>  brandFacet
>  divProductTypeIdFacet
>
>
>
>
>
>  
>
>
>
> // analysis output from debugQuery=on
>
> here you can see that the top score for itemNo:9030 is a part that does not
> start with 9030.
>
> the entries below (there are 4) all have exact matches - but they rank
> below
> this part - ???
>
>
>
> 
> 0.585678 = (MATCH) max of:
>  0.585678 = (MATCH) weight(itemNo:9030^0.9 in 582979), product of:
>0.021552926 = queryWeight(itemNo:9030^0.9), product of:
>  0.9 = boost
>  10.270785 = idf(docFreq=55, maxDocs=594893)
>  0.0023316324 = queryNorm
>27.173943 = (MATCH) fieldWeight(itemNo:9030 in 582979), product of:
>  2.6457512 = tf(termFreq(itemNo:9030)=7)
>  10.270785 = idf(docFreq=55, maxDocs=594893)
>  1.0 = fieldNorm(field=itemNo, doc=582979)
> 
>
>
>
> 
> 0.22136548 = (MATCH) max of:
>  0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 499864), product of:
>0.021552926 = queryWeight(itemNo:9030^0.9), product of:
>  0.9 = boost
>  10.270785 = idf(docFreq=55, maxDocs=594893)
>  0.0023316324 = queryNorm
>10.270785 = (MATCH) fieldWeight(itemNo:9030 in 499864), product of:
>  1.0 = tf(termFreq(itemNo:9030)=1)
>  10.270785 = idf(docFreq=55, maxDocs=594893)
>  1.0 = fieldNorm(field=itemNo, doc=499864)
> 
>
> 
> 0.22136548 = (MATCH) max of:
>  0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 538826), product of:
>0.021552926 = queryWeight(itemNo:9030^0.9), product of:
>  0.9 = boost
>  10.270785 = idf(docFreq=55, maxDocs=594893)
>  0.0023316324 = queryNorm
>10.270785 = (MATCH) fieldWeight(itemNo:9030 in 538826), product of:
>  1.0 = tf(termFreq(itemNo:9030)=1)
>  10.270785 = idf(docFreq=55, maxDocs=594893)
>  1.0 = fieldNorm(field=itemNo, doc=538826)
> 
>
> 
> 0.22136548 = (MATCH) max of:
>  0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 544313), product of:
>0.021552926 = queryWeight(itemNo:9030^0.9), product of:
>  0.9 = boost
>  10.270785 = idf(docFreq=55, maxDocs=594893)
>  0.0023316324 = queryNorm
>10.270785 = (MATCH) fieldWeight(itemNo:9030 in 544313), product of:
>  1.0 = tf(termFreq(itemNo:9030)=1)
>  10.270785 = idf(docFreq=55, maxDocs=594893)
>  1.0 = fieldNorm(field=itemNo, doc=544313)
> 
>
> 
> 0.22136548 = (MATCH) max of:
>  0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 544657), product of:
>0.021552926 = queryWeight(itemNo:9030^0.9), product of:
>  0.9 = boost
>  10.270785 = idf(docFreq=55, maxDocs=594893)
>  0.0023316324 = queryNorm
>10.270785 = (MATCH) fieldWeight(itemNo:9030 in 544657), product of:
>  1.0 = tf(termFreq(itemNo:9030)=1)
>  10.270785 = idf(docFreq=55, maxDocs=594893)
>  1.0 = fieldNorm(field=itemNo, doc=544657)
> 
>
>
>
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/need-help-with-getting-exact-matches-to-score-higher-tp3983882.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Problem with AND clause in multi core search query

2012-05-15 Thread Erick Erickson
Right, but for that to work, there's an implicit connection between
the docs in core1 and core0, I assume provided by 123456 as
a foreign key or something. There's nothing automatically built
in like this in Solr 1.4 (joins come close, but those are trunk).

Whenever you try to make Solr act just like a database, you're
probably doing something you shouldn't. Solr is a very good search
engine, but it's not a RDBMS and shouldn't be asked to behave like
one.

In your case, consider de-normalizing the data and indexing all the related
data in a single document, even if it means repeating the data. Sometimes
this requires some judicious creativity, but it's the first thing I'd look at.
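
As a rough sketch with the sample data below (assuming value 123456 is what ties
the two records together, and using it as the id), a single denormalized
document might look like:

<add>
  <doc>
    <field name="id">123456</field>
    <field name="column1">A</field>
    <field name="column2">C</field>
  </doc>
</add>

and then a plain q=column1:A AND column2:C matches it directly in one core.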

Best
Erick

On Tue, May 15, 2012 at 10:54 AM, ravicv  wrote:
> Hi Erick ,
>
> My Schema is as follows
>
>  />
>   
>   
>   
>
> My data which i am indexing in core0 is
> id:1,  value:'123456',   column1:'A',    column2:'null'
> id:2,  value:'1234567895252',  column1:'B',    column2:'null'
>
> My data which i am indexing in core1 is
> id:3,  value:'123456',  column1:'null',  column2:'C'
>
> Now my query is
> http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1:"A";
> AND column2:"C"
>
> Response: No data
>
> In database we can achieve this by query querying separately
>  as follows
>
> select value from core0 where column1='A'
> intersect
> select value from core0 where column1='C'
>
> Same scenario i am trying to implement in my multi core SOLR setup. But i am
> unable to do so.
> Please let me know what should i do to implement this type of scenario in
> SOLR.
>
> I am using SOLR 1.4 version.
>
> Thanks
> Ravi
>
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Problem-with-AND-clause-in-multi-core-search-query-tp3983800p3983881.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Urgent! Highlighting not working as expected

2012-05-15 Thread TJ Tong
Hi Jack,

Thanks for your reply. I did not specify dismax when query with highlighting
enabled: q=text:"G-Money"&hl=true&hl.fl=*, that was the whole query string I
sent. What puzzled me is that the "string" field "cr_firstname" was copied
to text, but it was not highlighted. But if I use
q=cr_fristname:"G-Money"&hl=true&hl.fl=*, it will be highlighted. I attached
my solrconfig.xml here, could you please take a look? Thanks again!
http://lucene.472066.n3.nabble.com/file/n3983883/solrconfig.xml
solrconfig.xml 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Urgent-Highlighting-not-working-as-expected-tp3983755p3983883.html
Sent from the Solr - User mailing list archive at Nabble.com.


need help with getting exact matches to score higher

2012-05-15 Thread geeky2
Hello all,


i am trying to tune our core for exact matches on a single field (itemNo)
and having issues getting it to work.  

in addition - i need help understanding the output from debugQuery=on where
it presents the scoring.

my goal is to get exact matches to arrive at the top of the results. 
however - what i am seeing is non-exact matches arrive at the top of the
results with MUCH higher scores.



// from schema.xml - i am copying itemNo in to the string field for use in
boosting

  
  

// from solrconfig.xml - i have the boost set for my special exact match
field and the sorting on score desc.

  

  edismax
  all
  10
  *itemNoExactMatchStr^30 itemNo^.9 divProductTypeDesc^.8
brand^.5*
  *:*
 * score desc*
  true
  itemDescFacet
  brandFacet
  divProductTypeIdFacet





  



// analysis output from debugQuery=on

here you can see that the top score for itemNo:9030 is a part that does not
start with 9030.

the entries below (there are 4) all have exact matches - but they rank below
this part - ???



2TTZ9030C1000A* ">
0.585678 = (MATCH) max of:
  0.585678 = (MATCH) weight(itemNo:9030^0.9 in 582979), product of:
0.021552926 = queryWeight(itemNo:9030^0.9), product of:
  0.9 = boost
  10.270785 = idf(docFreq=55, maxDocs=594893)
  0.0023316324 = queryNorm
27.173943 = (MATCH) fieldWeight(itemNo:9030 in 582979), product of:
  2.6457512 = tf(termFreq(itemNo:9030)=7)
  10.270785 = idf(docFreq=55, maxDocs=594893)
  1.0 = fieldNorm(field=itemNo, doc=582979)




9030*   ">
0.22136548 = (MATCH) max of:
  0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 499864), product of:
0.021552926 = queryWeight(itemNo:9030^0.9), product of:
  0.9 = boost
  10.270785 = idf(docFreq=55, maxDocs=594893)
  0.0023316324 = queryNorm
10.270785 = (MATCH) fieldWeight(itemNo:9030 in 499864), product of:
  1.0 = tf(termFreq(itemNo:9030)=1)
  10.270785 = idf(docFreq=55, maxDocs=594893)
  1.0 = fieldNorm(field=itemNo, doc=499864)


9030   *">
0.22136548 = (MATCH) max of:
  0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 538826), product of:
0.021552926 = queryWeight(itemNo:9030^0.9), product of:
  0.9 = boost
  10.270785 = idf(docFreq=55, maxDocs=594893)
  0.0023316324 = queryNorm
10.270785 = (MATCH) fieldWeight(itemNo:9030 in 538826), product of:
  1.0 = tf(termFreq(itemNo:9030)=1)
  10.270785 = idf(docFreq=55, maxDocs=594893)
  1.0 = fieldNorm(field=itemNo, doc=538826)


9030   *">
0.22136548 = (MATCH) max of:
  0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 544313), product of:
0.021552926 = queryWeight(itemNo:9030^0.9), product of:
  0.9 = boost
  10.270785 = idf(docFreq=55, maxDocs=594893)
  0.0023316324 = queryNorm
10.270785 = (MATCH) fieldWeight(itemNo:9030 in 544313), product of:
  1.0 = tf(termFreq(itemNo:9030)=1)
  10.270785 = idf(docFreq=55, maxDocs=594893)
  1.0 = fieldNorm(field=itemNo, doc=544313)


9030   *">
0.22136548 = (MATCH) max of:
  0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 544657), product of:
0.021552926 = queryWeight(itemNo:9030^0.9), product of:
  0.9 = boost
  10.270785 = idf(docFreq=55, maxDocs=594893)
  0.0023316324 = queryNorm
10.270785 = (MATCH) fieldWeight(itemNo:9030 in 544657), product of:
  1.0 = tf(termFreq(itemNo:9030)=1)
  10.270785 = idf(docFreq=55, maxDocs=594893)
  1.0 = fieldNorm(field=itemNo, doc=544657)








--
View this message in context: 
http://lucene.472066.n3.nabble.com/need-help-with-getting-exact-matches-to-score-higher-tp3983882.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Problem with AND clause in multi core search query

2012-05-15 Thread ravicv
Hi Erick ,

My Schema is as follows


   
   


My data which i am indexing in core0 is 
id:1,  value:'123456',   column1:'A',column2:'null'
id:2,  value:'1234567895252',  column1:'B',column2:'null'

My data which i am indexing in core1 is 
id:3,  value:'123456',  column1:'null',  column2:'C'

Now my query is 
http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1:"A";
AND column2:"C"

Response: No data

In database we can achieve this by query querying separately 
 as follows

select value from core0 where column1='A'
intersect
select value from core0 where column1='C'

Same scenario i am trying to implement in my multi core SOLR setup. But i am
unable to do so.
Please let me know what should i do to implement this type of scenario in
SOLR.

I am using SOLR 1.4 version.

Thanks 
Ravi 




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Problem-with-AND-clause-in-multi-core-search-query-tp3983800p3983881.html
Sent from the Solr - User mailing list archive at Nabble.com.


Highlight feature

2012-05-15 Thread anarchos78
Hello friends

I have noticed that the highlighted terms of a query are returned in a second
xml struct (named "highlighting"). Is it possible to return the highlighted
terms in the doc field? I don't need the solr-generated ids of the
highlighted field.

Thanks,
Tom

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Highlight-feature-tp3983875.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Editing long Solr URLs - Chrome Extension

2012-05-15 Thread Erick Erickson
I think I put one up already, but in case I messed up github, complex
params like the fq here:

http://localhost:8983/solr/select?q=*:*&fq={!geofilt sfield=store
pt=52.67,7.30 d=5}

aren't properly handled.

But I'm already using it occasionally

Erick

On Tue, May 15, 2012 at 10:02 AM, Amit Nithian  wrote:
> Jan
>
> Thanks for your feedback! If possible can you file these requests on the
> github page for the extension so I can work on them? They sound like great
> ideas and I'll try to incorporate all of them in future releases.
>
> Thanks
> Amit
> On May 11, 2012 9:57 AM, "Jan Høydahl"  wrote:
>
>> I've been testing
>> https://chrome.google.com/webstore/detail/mbnigpeabbgkmbcbhkkbnlidcobbapff?hl=enbut
>>  I don't think it's great.
>>
>> Great work on this one. Simple and straight forward. A few wishes:
>> * Sticky mode? This tool would make sense in a sidebar, to do rapid
>> refinements
>> * If you edit a value and click "TAB", it is not updated :(
>> * It should not be necessary to URLencode all non-ascii chars - why not
>> leave colon, caret (^) etc as is, for better readability?
>> * Some param values in Solr may be large, such as "fl", "qf" or "bf".
>> Would be nice if the edit box was multi-line, or perhaps adjusts to the
>> size of the content
>>
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.facebook.com/Cominvent
>> Solr Training - www.solrtraining.com
>>
>> On 11. mai 2012, at 07:32, Amit Nithian wrote:
>>
>> > Hey all,
>> >
>> > I don't know about you but most of the Solr URLs I issue are fairly
>> > lengthy full of parameters on the query string and browser location
>> > bars aren't long enough/have multi-line capabilities. I tried to find
>> > something that does this but couldn't so I wrote a chrome extension to
>> > help.
>> >
>> > Please check out my blog post on the subject and please let me know if
>> > something doesn't work or needs improvement. Of course this can work
>> > for any URL with a query string but my motivation was to help edit my
>> > long Solr URLs.
>> >
>> >
>> http://hokiesuns.blogspot.com/2012/05/manipulating-urls-with-long-query.html
>> >
>> > Thanks!
>> > Amit
>>
>>


RE: Issue in Applying patch file

2012-05-15 Thread Dyer, James
SOLR-3430 is already applied to the latest 3.6 and 4.x (trunk) source code. Be 
sure you have sources from May 7, 2012 or later (for 3.6 this is SVN r1335205 or 
later; for trunk it is SVN r1335196 or later). No patches are needed.

About the "modern compiler" error, make sure you're running a 1.6 or 1.7 JDK 
(the default JDK on some Linux distributions is often inadequate). Issue "javac 
-version" from the command line as a sanity check.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: mechravi25 [mailto:mechrav...@yahoo.co.in] 
Sent: Tuesday, May 15, 2012 6:54 AM
To: solr-user@lucene.apache.org
Subject: Issue in Applying patch file

Hi,


We have checked out the latest version of Solr source code from svn. We are
trying to apply the following patch file to it.

 https://issues.apache.org/jira/browse/SOLR-3430

While applying the patch file using eclipse (i.e. using team-->apply patch
options), we are getting cross marks for certain java files and its getting
updated for the following java file alone and we are able to see the patch
file changes for this alone.

solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/TestThreaded.java

Why is that its not getting applied for the other set of java files which is
present in the patch file and sometimes, we are getting "file does not
exist" error even if the corresponding files are present.

And also, when I try to ant build it after applying the patch, Im getting
the following error

common-build.xml:949: Error starting modern compiler

Can you tell me If Im missing out anything? Can you please guide me on this?

Thanks in advance

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Issue-in-Applying-patch-file-tp3983842.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Boosting on field empty or not

2012-05-15 Thread Jack Krupansky
Let's go back to this step where things look correct, but we ran into the 
edismax bug which requires that you put a space between each left 
parenthesis and field name.


First, verify that you are using edismax or not.

Then, change:

&q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)&sort=score 
desc


to

&q=chairs AND ( regularprice:*^5 OR ( *:* -regularprice:*)^0.5)&sort=score 
desc


(Note the space after each "(".)

And make sure to URL-encode your spaces as "+" or "%20".

Also, try this to verify whether you really have chairs without prices:

&q=chairs AND ( *:* -regularprice:*)&sort=score desc

(Note that space after "(".)

And for sanity, try this as well:

&q=chairs AND ( -regularprice:*)&sort=score desc

(Again, note that space after "(".)

Those two queries should give identical results.

Finally, technically you should be able to use "*" or "[* TO *]" to match 
all values or negate them to match all documents without a value in a field, 
but try both to see that they do return the identical set of documents.


-- Jack Krupansky

-Original Message- 
From: Donald Organ

Sent: Monday, May 14, 2012 4:19 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

&q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)&sort=score 
desc



Same effect.


On Mon, May 14, 2012 at 4:12 PM, Jack Krupansky 
wrote:



Change the second boost to 0.5 to de-boost doc that are missing the field
value. You had them the same.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 4:01 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK it looks like the query change is working but it looks like it boosting
everything even documents that have that field empty

On Mon, May 14, 2012 at 3:41 PM, Donald Organ wrote:

 OK i must be missing something:



defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10 name^10
codeTXT^2 description^1 brand_search^0 cat_search^10&spellcheck=true&
spellcheck.collate=true&spellcheck.q=chairs&facet.
mincount=1&fl=code,score&q=chairs AND (regularprice:*^5 OR (*:*
-regularprice:*)^5)&sort=score desc


On Mon, May 14, 2012 at 3:36 PM, Jack Krupansky 
**wrote:

 "(*:* -regularprice:*)5" should be "(*:* -regularprice:*)^0.5" - the

missing boost operator.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 3:31 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

Still doesnt appear to be working.  Here is the full Query string:


defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10
name^10
codeTXT^2 description^1 brand_search^0
cat_search^10&spellcheck=true&spellcheck.collate=true&**
spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs
AND (regularprice:*^5 OR (*:* -regularprice:*)5)


On Mon, May 14, 2012 at 3:28 PM, Jack Krupansky 
**wrote:

 Sorry, make that:



&q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)

I forgot that pure negative queries are broken again, so you need the
*:*
in there.

I noticed that you second boost operator was missing as well.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 3:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK i just tried:

&q=chairs AND (regularprice:*^5 OR (-regularprice:*)5)


And that gives me 0 results


On Mon, May 14, 2012 at 2:51 PM, Jack Krupansky <
j...@basetechnology.com
>*
*wrote:

 foo AND (field:*^2.0 OR (-field:*)^0.5)


So, if a doc has anything in the field, it gets boosted, and if the 
doc

does not have anything in the field, de-boost it. Choose the boost
factors
to suit your desired boosting effect.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 2:38 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK maybe i need to describe this a little more.

Basically I want documents that have a given field populated to have a
higher score than the documents that dont.  So if you search for foo I
want
documents that contain foo, but i want the documents that have field a
populated to have a higher score...

Is there a way to do this?



On Mon, May 14, 2012 at 2:22 PM, Jack Krupansky <
j...@basetechnology.com
>*
*wrote:

 In a query or filter query you can write +field:* to require that a
field

 be populated or +(-field:*) to require that it not be populated



-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 2:10 PM
To: solr-user
Subject: Boosting on field empty or not

Is there a way to boost a document based on whether the field is 
empty

or
not.  I am looking to boost documents that have a specific field
populated.



















Re: Editing long Solr URLs - Chrome Extension

2012-05-15 Thread Amit Nithian
Jan

Thanks for your feedback! If possible can you file these requests on the
github page for the extension so I can work on them? They sound like great
ideas and I'll try to incorporate all of them in future releases.

Thanks
Amit
On May 11, 2012 9:57 AM, "Jan Høydahl"  wrote:

> I've been testing
> https://chrome.google.com/webstore/detail/mbnigpeabbgkmbcbhkkbnlidcobbapff?hl=enbut
>  I don't think it's great.
>
> Great work on this one. Simple and straight forward. A few wishes:
> * Sticky mode? This tool would make sense in a sidebar, to do rapid
> refinements
> * If you edit a value and click "TAB", it is not updated :(
> * It should not be necessary to URLencode all non-ascii chars - why not
> leave colon, caret (^) etc as is, for better readability?
> * Some param values in Solr may be large, such as "fl", "qf" or "bf".
> Would be nice if the edit box was multi-line, or perhaps adjusts to the
> size of the content
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.facebook.com/Cominvent
> Solr Training - www.solrtraining.com
>
> On 11. mai 2012, at 07:32, Amit Nithian wrote:
>
> > Hey all,
> >
> > I don't know about you but most of the Solr URLs I issue are fairly
> > lengthy full of parameters on the query string and browser location
> > bars aren't long enough/have multi-line capabilities. I tried to find
> > something that does this but couldn't so I wrote a chrome extension to
> > help.
> >
> > Please check out my blog post on the subject and please let me know if
> > something doesn't work or needs improvement. Of course this can work
> > for any URL with a query string but my motivation was to help edit my
> > long Solr URLs.
> >
> >
> http://hokiesuns.blogspot.com/2012/05/manipulating-urls-with-long-query.html
> >
> > Thanks!
> > Amit
>
>


Re: Boosting on field empty or not

2012-05-15 Thread Ahmet Arslan
> > The problem with what you
> provided is
> > it is boosting ALL documents whether
> > the field is empty or not
> 
> Then all of your fields are non-empty? What is the type of
> your field?

How do you feed your documents to solr? Maybe you are indexing an empty string?
Is your field indexed="true"? 

http://wiki.apache.org/solr/SolrQuerySyntax#Differences_From_Lucene_Query_Parser

"-field:[* TO *] finds all documents without a value for field"

Another approach is to use default="SOMETHING" in your field definition. 
(schema.xml)   
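
For example, something like this in schema.xml (purely illustrative; for a
numeric field you would pick a numeric sentinel value instead):

<field name="yourfield" type="string" indexed="true" stored="true" default="SOMETHING"/>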


Then you can use field:SOMETHING to retrieve empty fields. 
+*:* -field:SOMETHING retrieves non-empty documents.



Re: adding an OR to a fq makes some doc that matched not match anymore

2012-05-15 Thread jmlucjav
oh yeah, forgot about negatives and *:*...
thanks

--
View this message in context: 
http://lucene.472066.n3.nabble.com/adding-an-OR-to-a-fq-makes-some-doc-that-matched-not-match-anymore-tp3983775p3983863.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr tmp working directory

2012-05-15 Thread G.Long

Thank you :)

Gary

Le 15/05/2012 15:27, Jack Krupansky a écrit :
Solr is probably simply using the Java JVM default. Set the 
java.io.tmpdir system property. Something equivalent to the following:


java -Djava.io.tmpdir=/mytempdir ...

On Windows you can set the TMP environment variable.

-- Jack Krupansky

-Original Message- From: G.Long
Sent: Tuesday, May 15, 2012 9:04 AM
To: solr-user@lucene.apache.org
Subject: Solr tmp working directory

Hi :)

I'm using SolrJ to index documents. I noticed that during the indexing
process, .tmp files are created in my /tmp folder. These files contain
the xml commands  for the documents I add to the index.

Can I change this folder in Solr config and where is it?

Thanks,
Gary




Re: Solr tmp working directory

2012-05-15 Thread Jack Krupansky
Solr is probably simply using the Java JVM default. Set the java.io.tmpdir 
system property. Something equivalent to the following:


java -Djava.io.tmpdir=/mytempdir ...

On Windows you can set the TMP environment variable.

-- Jack Krupansky

-Original Message- 
From: G.Long

Sent: Tuesday, May 15, 2012 9:04 AM
To: solr-user@lucene.apache.org
Subject: Solr tmp working directory

Hi :)

I'm using SolrJ to index documents. I noticed that during the indexing
process, .tmp files are created in my /tmp folder. These files contain
the xml commands  for the documents I add to the index.

Can I change this folder in Solr config and where is it?

Thanks,
Gary 



Re: Boosting on field empty or not

2012-05-15 Thread Ahmet Arslan
> The problem with what you provided is
> it is boosting ALL documents whether
> the field is empty or not

Then all of your fields are non-empty? What is the type of your field?



Re: Boosting on field empty or not

2012-05-15 Thread Donald Organ
The problem with what you provided is it is boosting ALL documents whether
the field is empty or not

On Tue, May 15, 2012 at 3:52 AM, Ahmet Arslan  wrote:

> > Basically I want documents that have a given field populated
> > to have a
> > higher score than the documents that dont.  So if you
> > search for foo I want
> > documents that contain foo, but i want the documents that
> > have field a
> > populated to have a higher score...
>
>
> Hi Donald,
>
> Since you are using edismax, it is better to use bq (boosting query) for
> this.
>
> bq=regularprice:[* TO *]^50
>
> http://wiki.apache.org/solr/DisMaxQParserPlugin#bq_.28Boost_Query.29
>
> defType=edismax&qf=nameSuggest^10 name^10 codeTXT^2 description^1
> brand_search^0 cat_search^10&q=chairs&bq=regularprice:[* TO *]^50
>
>


Re: Show a portion of searchable text in Solr

2012-05-15 Thread Jack Krupansky

See the "/browse" request handler in the example config.

Only stored fields will be highlighted.
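
As a quick query-time sketch (assuming a stored field called "content"), the
highlighting parameters just go on the request URL, e.g.:

http://localhost:8983/solr/select?q=solr&hl=true&hl.fl=content&hl.snippets=1&hl.fragsize=300

or you can put the same parameters in the "defaults" section of your request
handler in solrconfig.xml.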

-- Jack Krupansky

-Original Message- 
From: Shameema Umer 
Sent: Tuesday, May 15, 2012 2:59 AM 
To: solr-user@lucene.apache.org 
Subject: Re: Show a portion of searchable text in Solr 


Can somebody tell me where I should place the highlighting parameters? When
I add them to the query, it is not working:
&hl=true&hl.requireFieldMatch=true&hl.fl=*

FYI: I am new to Solr. My aim is to have emphasis tags on the queried
words, and I need to display only the query-relevant snippet of the document.

Thanks
Shameema





On Mon, May 14, 2012 at 1:18 PM, Ahmet Arslan  wrote:


> I have indexed very large documents, In some cases these
> documents has
> 100.000 characters. Is there a way to return a portion of
> the documents
> (lets say the 300 first characters) when i am querying
> "Solr"?. Is there any
> attribute to set in the schema.xml or solrconfig.xml to
> achieve this?

I have a set-up with very large documents too. Here are two different
solutions that I have used in the past:

1) Use highlighting with hl.alternateField and hl.maxAlternateFieldLength
http://wiki.apache.org/solr/HighlightingParameters

2) Create an extra field (indexed="false" and stored="true") using
copyField just for display purposes. (&fl=shortField)


http://wiki.apache.org/solr/SchemaXml#Copy_Fields

Also, didn't used by myself yet but I *think* this can be accomplished by
using a custom Transformer too.
http://wiki.apache.org/solr/DocTransformers



Solr tmp working directory

2012-05-15 Thread G.Long

Hi :)

I'm using SolrJ to index documents. I noticed that during the indexing 
process, .tmp files are created in my /tmp folder. These files contain 
the xml commands  for the documents I add to the index.


Can I change this folder in Solr config and where is it?

Thanks,
Gary


Re: simple query help

2012-05-15 Thread Jack Krupansky
Yes, the parentheses are needed to prioritize the operator precedence (do 
the ANDs and then OR those results.) And, add a space after both left 
parentheses to account for the edismax bug.


(https://issues.apache.org/jira/browse/SOLR-3377)
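
So the combined query from your original post would become something like:

q=( skcode:2021051 AND flength:368.0) OR ( skcode:2021049 AND ent_no:1040970907)

(note the extra space after each left parenthesis as the workaround).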

-- Jack Krupansky

-Original Message- 
From: András Bártházi

Sent: Tuesday, May 15, 2012 6:50 AM
To: solr-user@lucene.apache.org
Subject: Re: simple query help

Hi,

You should use parantheses, have you tried that?
q=(skcode:2021051 and flength:368.0) or (skcode:2021049 and
ent_no:1040970907)

http://robotlibrarian.billdueber.com/solr-and-boolean-operators/

Bye,
 Andras

2012/5/15 Peter Kirk 


Hi

Can someone please give me some help with a simple query.

If I search
q=skcode:2021051 and flength:368.0

I get 1 document returned (doc A)

If I search
q=skcode:2021049 and ent_no:1040970907

I get 1 document returned (doc B)


But if I search
q=skcode:2021051 and flength:368.0 or skcode:2021049 and ent_no:1040970907

I get no documents returned.

Shouldn't I get both docA and docB?

Thanks,
Peter






Re: simple query help

2012-05-15 Thread Jack Krupansky
By removing the defType you reverted to using the traditional Solr/Lucene 
query parser which supports the particular query syntax you used (as long as 
"AND" is in upper-case) and without the parenthesis bug of edismax.


-- Jack Krupansky

-Original Message- 
From: Peter Kirk

Sent: Tuesday, May 15, 2012 8:23 AM
To: solr-user@lucene.apache.org
Subject: RE: simple query help

Hi

If I understand the terms correctly, the search-handler was configured to 
use "edismax".


The start of the configuration in the solrconfig.xml looks like this:


   
 edismax

In any case, when I commented-out the "deftype" entry, and restarted the 
solr webapp, things began to function as I expected.


But whether or not it was simply the act of restarting - I'm not sure. (I 
had also found out that "AND " and "OR" should be written in uppercase, but 
this made no difference until after I had restarted).



Thanks for your time,
Peter



-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com]
Sent: 15. maj 2012 13:25
To: solr-user@lucene.apache.org
Subject: RE: simple query help


It doesn't make a difference. But now I'm thinking there's something
completely odd - and I wonder if it's necessary to use a special
search-handler to achieve what  I want.

For example, if I execute
q=(skcode:2021051 AND flength:368.0)

I get no results. If I omit the parentheses, I get 1 result.
(Let alone trying to combine several Boolean clauses).


Which query parser are you using?



RE: simple query help

2012-05-15 Thread Ahmet Arslan
> But whether or not it was simply the act of restarting - I'm
> not sure. (I had also found out that "AND " and "OR" should
> be written in uppercase, but this made no difference until
> after I had restarted).

By the way, there is a control parameter for this. 

"lowercaseOperators A Boolean parameter indicating if lowercase "and" and 
"or" should be treated the same as operators "AND" and "OR". "

http://lucidworks.lucidimagination.com/display/solr/The+Extended+DisMax+Query+Parser


Re: simple query help

2012-05-15 Thread Jack Krupansky
Are you using the edismax query parser (which permits lower case "and" and 
"or" operators)? If so, there is a bug with parenthesized sub-queries. If 
you have a left parenthesis immediately before a field name (which you do in 
this case) the query fails. The short-term workaround is to place a space 
between the left parenthesis and the field name.


See:
https://issues.apache.org/jira/browse/SOLR-3377

-- Jack Krupansky

-Original Message- 
From: Peter Kirk

Sent: Tuesday, May 15, 2012 7:04 AM
To: solr-user@lucene.apache.org
Subject: RE: simple query help

Hi - thanks for the response. Yes I have tried with parentheses, to group as 
you suggest.


It doesn't make a difference. But now I'm thinking there's something 
completely odd - and I wonder if it's necessary to use a special 
search-handler to achieve what  I want.


For example, if I execute
q=(skcode:2021051 AND flength:368.0)

I get no results. If I omit the parentheses, I get 1 result. (Let alone 
trying to combine several Boolean clauses).


/Peter


-Original Message-
From: András Bártházi [mailto:and...@barthazi.hu]
Sent: 15. maj 2012 12:51
To: solr-user@lucene.apache.org
Subject: Re: simple query help

Hi,

You should use parantheses, have you tried that?
q=(skcode:2021051 and flength:368.0) or (skcode:2021049 and
ent_no:1040970907)

http://robotlibrarian.billdueber.com/solr-and-boolean-operators/

Bye,
 Andras

2012/5/15 Peter Kirk 


Hi

Can someone please give me some help with a simple query.

If I search
q=skcode:2021051 and flength:368.0

I get 1 document returned (doc A)

If I search
q=skcode:2021049 and ent_no:1040970907

I get 1 document returned (doc B)


But if I search
q=skcode:2021051 and flength:368.0 or skcode:2021049 and ent_no:1040970907

I get no documents returned.

Shouldn't I get both docA and docB?

Thanks,
Peter






RE: simple query help

2012-05-15 Thread Peter Kirk
Hi

If I understand the terms correctly, the search-handler was configured to use 
"edismax".

The start of the configuration in the solrconfig.xml looks like this:



  edismax

In any case, when I commented-out the "deftype" entry, and restarted the solr 
webapp, things began to function as I expected.

But whether or not it was simply the act of restarting - I'm not sure. (I had 
also found out that "AND " and "OR" should be written in uppercase, but this 
made no difference until after I had restarted).


Thanks for your time,
Peter



-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com] 
Sent: 15. maj 2012 13:25
To: solr-user@lucene.apache.org
Subject: RE: simple query help

> It doesn't make a difference. But now I'm thinking there's something 
> completely odd - and I wonder if it's necessary to use a special 
> search-handler to achieve what  I want.
> 
> For example, if I execute
> q=(skcode:2021051 AND flength:368.0)
> 
> I get no results. If I omit the parentheses, I get 1 result.
> (Let alone trying to combine several Boolean clauses).

Which query parser are you using?




Re: - Solr 4.0 - How do I enable JSP support ? ...

2012-05-15 Thread Stefan Matheis
Afaik we disabled JSP-Functionality in SOLR-3159 while upgrading Jetty .. 



On Tuesday, May 15, 2012 at 1:44 PM, Erick Erickson wrote:

> What do you mean "jsp support"? What is it you're trying to do
> with jsp? What servelet container are you using? Details matter.
> 
> Best
> Erick
> 
> On Mon, May 14, 2012 at 5:34 PM, Naga Vijayapuram  (mailto:nvija...@tibco.com)> wrote:
> > Hello,
> > 
> > How do I enable JSP support in Solr 4.0 ?
> > 
> > Thanks
> > Naga
> 





Re: Problem with AND clause in multi core search query

2012-05-15 Thread Erick Erickson
I really don't understand what you're trying to achieve.

"query : column1:"A" should searched in core0 and column2:"B" should be
searched in core1 and later the results from both queries should use
condition AND and give final response.?"

core1 and core0 are completely separate cores, with separate documents.
The only relationship between documents in the two cores is that they
should conform to the same schema since you're using shards. So saying
that your query should search in just one column in each core then AND the
results really doesn't make any sense to me.

I suspect there are some assumptions you're not explicitly stating about the
relationship between documents in separate cores that would help here...

Best
Erick

On Tue, May 15, 2012 at 3:07 AM, ravicv  wrote:
> Thanks Tommaso .
>
> Could you please tell me is their any way to get this scenario to get
> worked?
>
> http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1:"A";
> AND column2:"B"
>
> Is their any way we can achieve this below scenario
>
> query : column1:"A" should searched in core0 and column2:"B" should be
> searched in core1 and later the results from both queries should use
> condition AND and give final response.?
>
> since both will return common field as response.
>
> For reference my schema is :
>
>    required="true" />
>   
>   
>   
>
> Thanks
> Ravi
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Problem-with-AND-clause-in-multi-core-search-query-tp3983800p3983806.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: simple query help

2012-05-15 Thread Péter Király
Hi,

it is AND (uppercase) not and (lowercase) (and OR instead of or).

Regards,
Peter

2012/5/15 András Bártházi :
> Hi,
>
> You should use parantheses, have you tried that?
> q=(skcode:2021051 and flength:368.0) or (skcode:2021049 and
> ent_no:1040970907)
>
> http://robotlibrarian.billdueber.com/solr-and-boolean-operators/
>
> Bye,
>  Andras
>
> 2012/5/15 Peter Kirk 
>
>> Hi
>>
>> Can someone please give me some help with a simple query.
>>
>> If I search
>> q=skcode:2021051 and flength:368.0
>>
>> I get 1 document returned (doc A)
>>
>> If I search
>> q=skcode:2021049 and ent_no:1040970907
>>
>> I get 1 document returned (doc B)
>>
>>
>> But if I search
>> q=skcode:2021051 and flength:368.0 or skcode:2021049 and ent_no:1040970907
>>
>> I get no documents returned.
>>
>> Shouldn't I get both docA and docB?
>>
>> Thanks,
>> Peter
>>
>>



-- 
Péter Király
eXtensible Catalog
http://eXtensibleCatalog.org
http://drupal.org/project/xc


Issue in Applying patch file

2012-05-15 Thread mechravi25
Hi,


We have checked out the latest version of Solr source code from svn. We are
trying to apply the following patch file to it.

 https://issues.apache.org/jira/browse/SOLR-3430

While applying the patch file using eclipse (i.e. using team-->apply patch
options), we are getting cross marks for certain java files and its getting
updated for the following java file alone and we are able to see the patch
file changes for this alone.

solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/TestThreaded.java

Why is that its not getting applied for the other set of java files which is
present in the patch file and sometimes, we are getting "file does not
exist" error even if the corresponding files are present.

And also, when I try to ant build it after applying the patch, Im getting
the following error

common-build.xml:949: Error starting modern compiler

Can you tell me If Im missing out anything? Can you please guide me on this?

Thanks in advance

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Issue-in-Applying-patch-file-tp3983842.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: document cache

2012-05-15 Thread Erick Erickson
Yes. In fact, all the caches get flushed on every commit/replication cycle.

Some of the caches get autowarmed when a new searcher is opened,
which happens...you guessed it...every time a commit/replication happens.
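
Autowarming is configured per cache in solrconfig.xml, roughly like this (the
numbers are just illustrative):

<filterCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="128"/>

Note that the documentCache can't be autowarmed, since its keys are internal
Lucene document ids that change between searchers.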

Best
Erick

On Tue, May 15, 2012 at 1:32 AM, shinkanze  wrote:
>  hi ,
>
> I want to know the internal mechanism of how the document cache works,
>
> specifically its flushing cycle ...
>
> i.e. does it get flushed on every commit/replication?
>
> regards
>
> Rajat Rastogi
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/document-cache-tp3983796.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: - Solr 4.0 - How do I enable JSP support ? ...

2012-05-15 Thread Erick Erickson
What do you mean "jsp support"? What is it you're trying to do
with jsp? What servlet container are you using? Details matter.

Best
Erick

On Mon, May 14, 2012 at 5:34 PM, Naga Vijayapuram  wrote:
> Hello,
>
> How do I enable JSP support in Solr 4.0 ?
>
> Thanks
> Naga


RE: simple query help

2012-05-15 Thread Ahmet Arslan
> It doesn't make a difference. But now I'm thinking there's
> something completely odd - and I wonder if it's necessary to
> use a special search-handler to achieve what  I want.
> 
> For example, if I execute 
> q=(skcode:2021051 AND flength:368.0)
> 
> I get no results. If I omit the parentheses, I get 1 result.
> (Let alone trying to combine several Boolean clauses).

Which query parser are you using?


RE: simple query help

2012-05-15 Thread Peter Kirk
Hi - thanks for the response. Yes I have tried with parentheses, to group as 
you suggest.

It doesn't make a difference. But now I'm thinking there's something completely 
odd - and I wonder if it's necessary to use a special search-handler to achieve 
what  I want.

For example, if I execute 
q=(skcode:2021051 AND flength:368.0)

I get no results. If I omit the parentheses, I get 1 result. (Let alone trying 
to combine several Boolean clauses).

/Peter


-Original Message-
From: András Bártházi [mailto:and...@barthazi.hu] 
Sent: 15. maj 2012 12:51
To: solr-user@lucene.apache.org
Subject: Re: simple query help

Hi,

You should use parantheses, have you tried that?
q=(skcode:2021051 and flength:368.0) or (skcode:2021049 and
ent_no:1040970907)

http://robotlibrarian.billdueber.com/solr-and-boolean-operators/

Bye,
  Andras

2012/5/15 Peter Kirk 

> Hi
>
> Can someone please give me some help with a simple query.
>
> If I search
> q=skcode:2021051 and flength:368.0
>
> I get 1 document returned (doc A)
>
> If I search
> q=skcode:2021049 and ent_no:1040970907
>
> I get 1 document returned (doc B)
>
>
> But if I search
> q=skcode:2021051 and flength:368.0 or skcode:2021049 and ent_no:1040970907
>
> I get no documents returned.
>
> Shouldn't I get both docA and docB?
>
> Thanks,
> Peter
>
>



Re: simple query help

2012-05-15 Thread András Bártházi
Hi,

You should use parentheses; have you tried that?
q=(skcode:2021051 and flength:368.0) or (skcode:2021049 and
ent_no:1040970907)

http://robotlibrarian.billdueber.com/solr-and-boolean-operators/

Bye,
  Andras

2012/5/15 Peter Kirk 

> Hi
>
> Can someone please give me some help with a simple query.
>
> If I search
> q=skcode:2021051 and flength:368.0
>
> I get 1 document returned (doc A)
>
> If I search
> q=skcode:2021049 and ent_no:1040970907
>
> I get 1 document returned (doc B)
>
>
> But if I search
> q=skcode:2021051 and flength:368.0 or skcode:2021049 and ent_no:1040970907
>
> I get no documents returned.
>
> Shouldn't I get both docA and docB?
>
> Thanks,
> Peter
>
>


Re: adding an OR to a fq makes some doc that matched not match anymore

2012-05-15 Thread Ahmet Arslan
> that does not change the results for
> me:
> 
> -suggest?q=suggest_terms:lap*&fq=type:P&fq=((-type:B))&debugQuery=true
> -found 1
> 
> -suggest?q=suggest_terms:lap*&fq=type:P&fq=((-type:B)+OR+name:aa)&debugQuery=true
> -found 0
> 

Negative clauses combined with OR do not work like this.
fq=+*:* -type:B name:aa should work. 


Re: adding an OR to a fq makes some doc that matched not match anymore

2012-05-15 Thread jmlucjav
that does not change the results for me:

-suggest?q=suggest_terms:lap*&fq=type:P&fq=((-type:B))&debugQuery=true
-found 1

-suggest?q=suggest_terms:lap*&fq=type:P&fq=((-type:B)+OR+name:aa)&debugQuery=true
-found 0

looks like a bug?
xab

--
View this message in context: 
http://lucene.472066.n3.nabble.com/adding-an-OR-to-a-fq-makes-some-doc-that-matched-not-match-anymore-tp3983775p3983828.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: query with DATE FIELD AND RANGE query using dismax

2012-05-15 Thread Jan Høydahl
Hi,

You can't. Try eDisMax instead: http://wiki.apache.org/solr/ExtendedDisMax
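
With edismax you can keep the field inside the query itself, e.g. something
along these lines (just a sketch, with the earlier date first):

defType=edismax&q=scanneddate:[2011-09-22T22:40:30Z TO 2012-02-02T01:30:52Z]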

--
Jan Høydahl, search solution architect
Cominvent AS - www.facebook.com/Cominvent
Solr Training - www.solrtraining.com

On 15. mai 2012, at 11:05, ayyappan wrote:

> Hi
> 
>   My queries are working with standard query handler but not in dismax.
> 
> *it is working fine *
> EX :
> q=scanneddate:["2012-02-02T01:30:52Z" TO "2011-09-22T22:40:30Z"] .
> 
> *Not Working :*
> EX
> defType=dismax&q=["2012-02-02T01:30:52Z" TO
> "2011-09-22T22:40:30Z"]&qf=scanneddate
> 
> How can I check for the date ranges  using solr's dismax query handler
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/query-with-DATE-FIELD-AND-RANGE-query-using-dismax-tp3983819.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: authentication for solr admin page?

2012-05-15 Thread findbestopensource
I have written an article on this covering the various steps to restrict /
authenticate the Solr admin interface.

http://www.findbestopensource.com/article-detail/restrict-solr-admin-access

Regards
Aditya
www.findbestopensource.com


On Thu, Mar 29, 2012 at 1:06 AM, geeky2  wrote:

> update -
>
> ok - i was reading about replication here:
>
> http://wiki.apache.org/solr/SolrReplication
>
> and noticed comments in the solrconfig.xml file related to HTTP Basic
> Authentication and the usage of the following tags:
>
> username
>password
>
> *Can i place these tags in the request handler to achieve an authentication
> scheme for the /admin page?*
>
> // snipped from the solrconfig.xml file
>
>   class="org.apache.solr.handler.admin.AdminHandlers"/>
>
> thanks for any help
> mark
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/authentication-for-solr-admin-page-tp3865665p3865747.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


query with DATE FIELD AND RANGE query using dismax

2012-05-15 Thread ayyappan
Hi

   My queries are working with standard query handler but not in dismax.

*it is working fine *
EX :
q=scanneddate:["2012-02-02T01:30:52Z" TO "2011-09-22T22:40:30Z"] .

*Not Working :*
EX
defType=dismax&q=["2012-02-02T01:30:52Z" TO
"2011-09-22T22:40:30Z"]&qf=scanneddate

How can I check for the date ranges  using solr's dismax query handler


--
View this message in context: 
http://lucene.472066.n3.nabble.com/query-with-DATE-FIELD-AND-RANGE-query-using-dismax-tp3983819.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Multi-words synonyms matching

2012-05-15 Thread Bernd Fehling
Without reading the whole thread let me say that you should not trust
the solr admin analysis. It takes the whole multiword search and runs
it all together at once through each analyzer step (factory).
But this is not how the real system works. First pitfall: the query parser
also splits at white space (if it is not a phrase query). Due to this,
a multiword query is sent chunk by chunk through the analyzer and,
second pitfall, each chunk runs through the whole analyzer on its own.

So if you are dealing with multiword synonyms you have the following
problems. Either you turn your query into a phrase so that the whole
phrase is analyzed at once and therefore looked up as multiword synonym
but phrase queries are not analyzed !!! OR you send your query chunk
by chunk through the analyzer but then they are not multiwords anymore
and are not found in your synonyms.txt.

From my experience I can say that it requires some deep work to get it done
but it is possible. I have connected a thesaurus to solr which is doing
query time expansion (no need to reindex if the thesaurus changes).
The thesaurus holds synonyms and "used for terms" in 24 languages. So
it is also some kind of language translation. And naturally the thesaurus
translates from single term to multi term synonyms and vice versa.

Regards,
Bernd


Am 14.05.2012 13:54, schrieb elisabeth benoit:
> Just for the record, I'd like to conclude this thread
> 
> First, you were right, there was no behaviour difference between fq and q
> parameters.
> 
> I realized that:
> 
> 1) my synonym (hotel de ville) has a stopword in it (de) and since I used
> tokenizerFactory="solr.KeywordTokenizerFactory" in my synonyms declaration,
> there was no stopword removal in the indexed expression, so when requesting
> "hotel de ville", after stopwords removal in query, Solr was comparing
> "hotel de ville"
> with "hotel ville"
> 
> but my queries never even got to that point since
> 
> 2) I made a mistake using "mairie" alone in the admin interface when
> testing my schema. The real field was something like "collectivités
> territoriales mairie",
> so the synonym "hotel de ville" was not even applied, because of the
> tokenizerFactory="solr.KeywordTokenizerFactory" in my synonym definition
> not splitting field into words when parsing
> 
> So my problem is not solved, and I'm considering solving it outside of Solr
> scope, unless someone else has a clue
> 
> Thanks again,
> Elisabeth
> 
> 
> 
> 2012/4/25 Erick Erickson 
> 
>> A little farther down the debug info output you'll find something
>> like this (I specified fq=name:features)
>>
>> <arr name="parsed_filter_queries">
>>   <str>name:features</str>
>> </arr>
>>
>>
>> so it may well give you some clue. But unless I'm reading things wrong,
>> your
>> q is going against a field that has much more information than the
>> CATEGORY_ANALYZED field, is it possible that the data from your
>> test cases simply isn't _in_ CATEGORY_ANALYZED?
>>
>> Best
>> Erick
>>
>> On Wed, Apr 25, 2012 at 9:39 AM, elisabeth benoit
>>  wrote:
>>> I'm not at the office until next Wednesday, and I don't have my Solr
>> under
>>> hand, but isn't debugQuery=on giving information only about the q parameter
>>> matching and nothing about fq parameter? Or do you mean
>>> "parsed_filter_queries" gives information about fq?
>>>
>>> CATEGORY_ANALYZED is being populated by a copyField instruction in
>>> schema.xml, and has the same field type as my catchall field, the search
>>> field for my searchHandler (the one being used by q parameter).
>>>
>>> CATEGORY (a string) is copied in CATEGORY_ANALYZED (field type is text)
>>>
>>> CATEGORY (a string) is copied in catchall field (field type is text),
>> and a
>>> lot of other fields are copied too in that catchall field.
>>>
>>> So as far as I can see, the same analysis should be done in both cases,
>> but
>>> obviously I'm missing something, and the only thing I can think of is a
>>> different behavior between q and fq parameter.
>>>
>>> I'll check that parsed_filter_queries first thing in the morning next
>>> Wednesday.
>>>
>>> Thanks a lot for your help.
>>>
>>> Elisabeth
>>>
>>>
>>> 2012/4/24 Erick Erickson 
>>>
 Elisabeth:

 What shows up in the debug section of the response when you add
 &debugQuery=on? There should be some bit of that section like:
 "parsed_filter_queries"

 My other question is "are you absolutely sure that your
 CATEGORY_ANALYZED field has the correct content?". How does it
 get populated?

 Nothing jumps out at me here

 Best
 Erick

 On Tue, Apr 24, 2012 at 9:55 AM, elisabeth benoit
  wrote:
> yes, thanks, but this is NOT my question.
>
> I was wondering why I have multiple matches with q="hotel de ville"
>> and
 no
> match with fq=CATEGORY_ANALYZED:"hotel de ville", since in both case
>> I'm
> searching in the same solr fieldType.
>
> Why is q parameter behaving differently in that case? Why do the
>> quotes
> work in one case and 

simple query help

2012-05-15 Thread Peter Kirk
Hi

Can someone please give me some help with a simple query.

If I search
q=skcode:2021051 and flength:368.0

I get 1 document returned (doc A)

If I search
q=skcode:2021049 and ent_no:1040970907

I get 1 document returned (doc B)


But if I search
q=skcode:2021051 and flength:368.0 or skcode:2021049 and ent_no:1040970907

I get no documents returned.

Shouldn't I get both docA and docB?

Thanks,
Peter
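
(A likely explanation, assuming the standard lucene query parser: lowercase "and"/"or" are not operators. They are analysed as ordinary terms, and are often removed as stopwords, so depending on the default operator the combined query quietly turns into something other than what was intended. Upper-casing the operators and grouping explicitly usually gives the intended result:)

q=(skcode:2021051 AND flength:368.0) OR (skcode:2021049 AND ent_no:1040970907)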



Query regarding multi core search

2012-05-15 Thread ravicv
HI,

I want to configure 2 cores in my Solr instance. Now I want to query core0
with one query and core1 with a different query, and finally merge the
results.

Please suggest me the best way to do this .

Thanks 
Ravi

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Query-regarding-multi-core-search-tp3983813.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Boosting on field empty or not

2012-05-15 Thread Ahmet Arslan
> Basically I want documents that have a given field populated
> to have a
> higher score than the documents that don't. So if you
> search for foo I want
> documents that contain foo, but I want the documents that
> have field a
> populated to have a higher score...


Hi Donald,

Since you are using edismax, it is better to use bq (boosting query) for this.

bq=reqularprice:[* TO *]^50

http://wiki.apache.org/solr/DisMaxQParserPlugin#bq_.28Boost_Query.29

defType=edismax&qf=nameSuggest^10 name^10 codeTXT^2 description^1 
brand_search^0 cat_search^10&q=chairs&bq=reqularprice:[* TO *]^50



Re: SOLR Security

2012-05-15 Thread Anupam Bhattacharya
Thanks for the suggestions.

I tried to use SolrJ within my servlet, but the SolrJ QueryResponse does not
return a well-formed JSON object.
I need the JSON string with quotes, as below, although
QueryResponse.toString() doesn't return JSON with quotes at all.

jsonp1337064466204({"responseHeader":{"status":0,"QTime":0,"params":{"json.wrf":"jsonp1337064466204","facet":"true","facet.mincount":"1","q":"*:*","facet.limit":"-1","json.nl":"map","facet.field":["title","abstract"],"wt":"json","rows":"0"}},"response":{"numFound":0,"start":0,"docs":[]},"facet_counts":{"facet_queries":{},"facet_fields":{"title":{},"abstract":{}},"facet_dates":{},"facet_ranges":{}}})

Regards

Anupam
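
(One way around this, in the spirit of the proxy approach described below: skip QueryResponse entirely and have the servlet stream Solr's raw wt=json output straight back, adding the user restriction on the server side. This is only a rough sketch; the session attribute, the allowed_users field and the Solr URL are assumptions, not details from this thread:)

import java.io.*;
import java.net.*;
import javax.servlet.http.*;

public class SolrProxyServlet extends HttpServlet {
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        // the user id comes from the server-side session, never from a client-supplied parameter
        String user = (String) req.getSession().getAttribute("userid");
        String q  = URLEncoder.encode(req.getParameter("q"), "UTF-8");
        String fq = URLEncoder.encode("allowed_users:" + user, "UTF-8"); // assumed ACL field
        URL solr = new URL("http://localhost:8983/solr/select?wt=json&q=" + q + "&fq=" + fq);

        // stream Solr's raw JSON straight through, so the quoting the browser expects stays intact
        resp.setContentType("application/json;charset=UTF-8");
        InputStream in = solr.openStream();
        OutputStream out = resp.getOutputStream();
        byte[] buf = new byte[8192];
        for (int n; (n = in.read(buf)) != -1; ) out.write(buf, 0, n);
        in.close();
    }
}

Any extra parameters the Ajax-Solr widgets send (facets, json.wrf for JSONP, and so on) would have to be whitelisted and passed through in the same way.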


On Fri, May 11, 2012 at 7:56 PM, Welty, Richard wrote:

> in fact, there's a sample proxy.php on the ajax-solr web page which can
> easily be modified into a security layer. my solr servers only listen to
> requests issued by a narrow list of systems, and everything gets routed
> through a modified copy of the proxy.php file, which checks whether the
> user is logged in, and adds terms to the query to limit returned results to
> those the user is permitted to see.
>
>
> -Original Message-
> From: Jan Høydahl [mailto:j...@hoydahl.no]
> Sent: Fri 5/11/2012 9:45 AM
> To: solr-user@lucene.apache.org
> Subject: Re: SOLR Security
>
> Hi,
>
> There is nothing stopping you from pointing Ajax-SOLR to a URL on your
> app-server, which acts as a security insulation layer between the Solr
> backend and the world. In this (thin) layer you can analyze the input and
> choose carefully what to let through and not.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.facebook.com/Cominvent
> Solr Training - www.solrtraining.com
>
> On 11. mai 2012, at 06:37, Anupam Bhattacharya wrote:
>
> > Yes, I agree with you.
> >
> > But Ajax-SOLR Framework doesn't fit in that manner. Any alternative
> > solution ?
> >
> > Anupam
> >
> > On Fri, May 11, 2012 at 9:41 AM, Klostermeyer, Michael <
> > mklosterme...@riskexchange.com> wrote:
> >
> >> Instead of hitting the Solr server directly from the client, I think I
> >> would go through your application server, which would have access to all
> >> the users data and can forward that to the Solr server, thereby hiding
> it
> >> from the client.
> >>
> >> Mike
> >>
> >>
> >> -Original Message-
> >> From: Anupam Bhattacharya [mailto:anupam...@gmail.com]
> >> Sent: Thursday, May 10, 2012 9:53 PM
> >> To: solr-user@lucene.apache.org
> >> Subject: SOLR Security
> >>
> >> I am using Ajax-Solr Framework for creating a search interface. The
> search
> >> interface works well.
> >> In my case, the results have document-level security, so indexing
> >> records with their authorized users helps me filter results per user
> >> based on the authentication of the user.
> >>
> >> The problem is that I always have to pass a parameter to the Solr server
> >> with userid={xyz}, which one can figure out from the Solr URL (the Ajax
> >> call URL) using the Firebug tool in the Net console in Firefox, and can
> >> change this parameter value to see other users' records which he/she is
> >> not authorized to see. Basically it is a cross-site scripting issue.
> >>
> >> I have read about some approaches for Solr security like Nginx with Jetty
> >> & .htaccess based security. Overall, what I understand from this is that
> >> we can restrict users from doing update/delete operations on Solr, and we
> >> can also restrict the Solr admin interface to certain IPs. But how can I
> >> restrict the {solr-server}/solr/select based results from access by
> >> different user ids?
> >>
>
>
>
>


Re: problem with date searching.

2012-05-15 Thread ayyappan
If I use
q=scanneddate:["2011-09-22T22:40:30Z" TO "2012-02-02T01:30:52Z"],
it is working fine.
But when I tried it with a dismax query, it is not working.
EX :
select/?defType=dismax&q=["2011-09-22T22:40:30Z" TO
"2012-02-02T01:30:52Z"]&qf=scanneddate&version=2.2&start=0&rows=50&indent=on&wt=json&&debugQuery=on&true

Please comment on the same.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/problem-with-date-searching-tp3961761p3983807.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Problem with AND clause in multi core search query

2012-05-15 Thread ravicv
Thanks, Tommaso.

Could you please tell me if there is any way to get this scenario to
work?

http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1:"A";
AND column2:"B" 

Is there any way we can achieve the scenario below?

Query: column1:"A" should be searched in core0 and column2:"B" should be
searched in core1, and later the results from both queries should be
combined with AND to give the final response?

(Both will return a common field in the response.)

For reference my schema is : 



   
   

Thanks 
Ravi
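
(For reference, distributed search via the shards parameter sends the same query to every shard, so a per-core query like the one above is not something shards can express. A sketch of the usual workaround, with core and field names as in the post; the id field is an assumption:)

http://localhost:8983/solr/core0/select?q=column1:"A"&fl=id&rows=1000&wt=json
http://localhost:8983/solr/core1/select?q=column2:"B"&fl=id&rows=1000&wt=json

The application then intersects the two id lists itself (that is the AND). If the data allows it, the simpler route is to index both columns into one core so a single query can do column1:"A" AND column2:"B".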


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Problem-with-AND-clause-in-multi-core-search-query-tp3983800p3983806.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Show a portion of searchable text in Solr

2012-05-15 Thread Shameema Umer
Can somebody tell me where I should place the highlighting parameters? When
I added them to the query, it did not work.
&hl=true&hl.requireFieldMatch=true&hl.fl=*

FYI: I am new to Solr. My aim is to have emphasis tags on the queried
words and to display only the query-relevant snippet of the document.

Thanks
Shameema
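
(A sketch of where the parameters usually go: straight onto the select URL. The field name "content" is a placeholder, and the same parameters can also be set as <str> defaults inside the request handler in solrconfig.xml:)

http://localhost:8983/solr/select?q=something
  &hl=true&hl.fl=content
  &hl.snippets=1&hl.fragsize=300
  &hl.simple.pre=<em>&hl.simple.post=</em>

hl.fl must name a stored field, hl.fragsize controls the snippet length, and hl.simple.pre/hl.simple.post wrap the matched words in emphasis tags. The snippets come back in a separate "highlighting" section of the response rather than inside the documents themselves.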





On Mon, May 14, 2012 at 1:18 PM, Ahmet Arslan  wrote:

> > I have indexed very large documents, In some cases these
> > documents has
> > 100.000 characters. Is there a way to return a portion of
> > the documents
> > (lets say the 300 first characters) when i am querying
> > "Solr"?. Is there any
> > attribute to set in the schema.xml or solrconfig.xml to
> > achieve this?
>
> I have a set-up with very large documents too. Here are two different
> solutions that I have used in the past:
>
> 1) Use highlighting with hl.alternateField and hl.maxAlternateFieldLength
> http://wiki.apache.org/solr/HighlightingParameters
>
> 2) Create an extra field (indexed="false" and stored="true") using
> copyField just for display purposes. (&fl=shortField)
>
> 
> http://wiki.apache.org/solr/SchemaXml#Copy_Fields
>
> Also, I haven't used it myself yet, but I *think* this can be accomplished by
> using a custom Transformer too.
> http://wiki.apache.org/solr/DocTransformers
>