Re: Solr Admin Interface, reworked - Go on? Go away?

2011-03-03 Thread Stefan Matheis
Hey Guys,

you're completely right :) I will clean up the existing code a little
bit and create a JIRA ticket.

On Wed, Mar 2, 2011 at 11:32 PM, Chris Hostetter
 wrote:
> If you run into any issues where you can't replicate something
> in the existing JSPs (or accomplish some new desirable functionality)
> because the info is not available from a request handler, don't hesitate
> to open feature request jiras to get the functionality added (and the
> folks with java know how can work on patches)

Thanks Hoss. There is already one idea for the
FieldAnalysisRequestHandler, which came up last week while I tried to
build a new Analysis page. I will open a JIRA issue for that too.

Regards
Stefan


Re: Dismax, q, q.alt, and defaultSearchField?

2011-03-03 Thread Jan Høydahl
Hi,

Try
q.alt={!dismax}banana
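For context, the `{!dismax}` prefix is Solr's LocalParams syntax: it makes that one parameter be parsed by the dismax parser, so q.alt is scored against qf rather than the defaultSearchField. A sketch of assembling such a request with Python's standard library (host and field names are illustrative, taken from the thread):

```python
from urllib.parse import urlencode

# Build a request where q is absent and q.alt carries the user input;
# the {!dismax} LocalParams prefix makes dismax parse it against qf.
params = {
    "q.alt": "{!dismax}banana",   # parsed by dismax, not the default lucene parser
    "qf": "field1 field2",        # illustrative field names
    "mm": "1",
    "tie": "0.1",
}
query_string = urlencode(params)
url = "http://localhost:8983/solr/select?" + query_string  # hypothetical host
print(query_string)
```

This keeps a single q.alt parameter usable both for "no user input" (`*:*`) and for dismax-parsed user input.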

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 2. mars 2011, at 23.06, mrw wrote:

> We have two banks of Solr nodes with identical schemas.  The data I'm
> searching for is in both banks.
> 
> One has defaultSearchField set to field1, the other has defaultSearchField
> set to field2.
> 
> We need to support both user queries and facet queries that have no user
> content.  For the latter, it appears I need to use q.alt=*:*, so I am
> investigating also using q.alt for user content (e.g., q.alt=banana).
> 
> I run the following query:
> 
> q.alt=banana
> &defType=dismax
> &mm=1
> &tie=0.1
> &qf=field1+field2
> 
> 
> On bank one, I get the expected results, but on bank two, I get 0 results.
> 
> I noticed (via debugQuery=true), that when I use q.alt, it resolves using
> the defaultSearchField (e.g., field1:banana), not the value of the qf param. 
> Therefore, I get different results.
> 
> If I switched to using q for user queries and q.alt for facet queries, I
> would still get different results, because q would resolve against the
> fields in the qf param, and q.alt would resolve against the default search
> field.
> 
> Is there a way to override this behavior in order to get consistent results?
> 
> Thanks!
> 
> 
> 
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Dismax-q-q-alt-and-defaultSearchField-tp2621061p2621061.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Selection Between Solr and Relational Database

2011-03-03 Thread Bing Li
Dear all,

I have been learning Solr for two months. At least right now, my system
runs well in a Solr cluster.

I have a question about implementing one feature in my system. When
retrieving documents by keyword, I believe Solr is faster than a relational
database. However, for the following operations, I guess the performance
must be lower. Is that right?

What I am trying to do is as follows.

1) All of the documents in Solr have one field which is used to
differentiate them; different categories have different values in this
field, e.g., Group: the documents are classified as "news", "sports",
"entertainment" and so on.

2) Retrieve all of the documents by the field Group.

3) Besides the field Group, another field called CreatedTime also
exists. I will filter the documents retrieved by Group according to the
value of CreatedTime. The filtered documents are the final results I need.

I guess the performance of these operations is lower than a relational
database's, right? Could you please explain why?

Best regards,
Li Bing


Re: Boost function problem with disquerymax

2011-03-03 Thread Gastone Penzo
You are right, it was not an indexed field, just stored.
Thanks

2011/3/2 Yonik Seeley 

> On Wed, Mar 2, 2011 at 11:34 AM, Gastone Penzo 
> wrote:
> > Hi,
> > for search I use dismax
> > and I want to boost a field with the bf parameter, like:
> > ...&bf=boost_has_img^5&
> > the boost_has_img field of my document is 3:
> > 3
> > if I see the results in debug query mode I can see:
> >   0.0 = (MATCH) FunctionQuery(int(boost_has_img)), product of:
> > 0.0 = int(boost_has_img)=0
> > 5.0 = boost
> > 0.06543833 = queryNorm
> > why is the score 0 if the value is 3 and the boost is 5???
>
> Solr thinks the value of boost_has_img is 0 for that document.
> Is boost_has_img an indexed field?
> If so, verify that the value is actually 3 for that specific document.
>
>
> -Yonik
> http://lucidimagination.com
>



-- 

Gastone Penzo
Webster Srl
www.webster.it
www.libreriauniversitaria.it


perfect match in dismax search

2011-03-03 Thread Gastone Penzo
How do I obtain a perfect match with a dismax query?

e.g.:

I want to search for "hello i love you" with defType=dismax in the title field,
and I want to obtain results whose title is exactly "hello i love you", with
all these terms
in this order.

Not fewer words or others.
How is it possible?

I tried +(hello i love you), but if I have a title which is "hello i
love you mum" it matches, and I don't want that!

Thanks


-- 

Gastone Penzo
Webster Srl
www.webster.it
www.libreriauniversitaria.it


Re: perfect match in dismax search

2011-03-03 Thread Markus Jelsma
Use either the string fieldType or a field with very little analysis 
(KeywordTokenizer + LowercaseFilter).
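A minimal field type along those lines, as a sketch (the type and field names are illustrative, not from the thread):

```xml
<!-- Hypothetical exact-match field: the whole value stays one token, lowercased -->
<fieldType name="text_exact" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<field name="title_exact" type="text_exact" indexed="true" stored="true"/>
<copyField source="title" dest="title_exact"/>
```

Searching title_exact:"hello i love you" would then only match titles that are exactly that phrase.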

> How do I obtain a perfect match with a dismax query?
> 
> e.g.:
> 
> I want to search for "hello i love you" with defType=dismax in the title field,
> and I want to obtain results whose title is exactly "hello i love you", with
> all these terms
> in this order.
> 
> Not fewer words or others.
> How is it possible?
> 
> I tried +(hello i love you), but if I have a title which is "hello i
> love you mum" it matches, and I don't want that!
> 
> Thanks


Re: Selection Between Solr and Relational Database

2011-03-03 Thread Markus Jelsma
Well, an RDBMS can be very fast, but Solr using fq can be very fast as well.
Just try fq=group:sports&fq=createdtime:
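Following Markus's example, each fq clause is an independent filter that Solr caches separately, which is what makes this pattern fast on repeated use. A sketch of assembling the request parameters with Python's standard library (field names and the date range are illustrative):

```python
from urllib.parse import urlencode

# Two independent filter queries: one by category, one by creation-time range.
# Each fq is cached separately in Solr's filterCache, so repeats are cheap.
params = [
    ("q", "*:*"),
    ("fq", "group:sports"),
    ("fq", "createdtime:[2011-01-01T00:00:00Z TO 2011-03-01T00:00:00Z]"),
]
query_string = urlencode(params)
print(query_string)
```

Passing a sequence of pairs (rather than a dict) is what allows the repeated fq parameter.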

> Dear all,
> 
> I have been learning Solr for two months. At least right now, my system
> runs well in a Solr cluster.
> 
> I have a question about implementing one feature in my system. When
> retrieving documents by keyword, I believe Solr is faster than a relational
> database. However, for the following operations, I guess the performance
> must be lower. Is that right?
> 
> What I am trying to do is as follows.
> 
> 1) All of the documents in Solr have one field which is used to
> differentiate them; different categories have different values in this
> field, e.g., Group: the documents are classified as "news", "sports",
> "entertainment" and so on.
> 
> 2) Retrieve all of the documents by the field Group.
> 
> 3) Besides the field Group, another field called CreatedTime also
> exists. I will filter the documents retrieved by Group according to the
> value of CreatedTime. The filtered documents are the final results I need.
> 
> I guess the performance of these operations is lower than a relational
> database's, right? Could you please explain why?
> 
> Best regards,
> Li Bing


Re: Solr TermsComponent: space in term

2011-03-03 Thread shrinath.m
Why was this thread left unanswered? Is there no way to achieve what the OP
asked?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-TermsComponent-space-in-term-tp1898889p2624203.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Admin Interface, reworked - Go on? Go away?

2011-03-03 Thread Jan Høydahl
Hi,

This is simply great! Bravo!

This alone is worth including, but I also (of course) have some comments/ideas:

The links section on top:
 * Move the links on top to bottom, reserving the top for navigation.
 * The "send email" could be changed to "Community forum" and instead of 
linking to 
   mailto:solr-user@lucene.apache.org, link to 
http://wiki.apache.org/solr/UsingMailingLists
 * Add a link to IRC chat. http://webchat.freenode.net/?channels=#solr
   That would surely increase the activity on the channel :)
 * Allow for custom links, like the admin-extra.html. Include HTML code from
   ${solr.solr.home}/admin-links.html - letting people add links to their own 
support etc.
 * Similarly for the top section, allow including HTML code from
   ${solr.solr.home}/admin-navi.html - where you may add links to your "Master" 
Solr or whatever

Suggestion for new tabs for each core:
 * "Prototyping" - pointing to the "/browse" Velocity GUI. Very useful!!
 * "CoreAdmin" - Buttons "reload core", "remove core", "rename"...

In the "System" tab for each core, it would be great to show some key 
info:
 * # docs
 * Size of index (MB)
 * Last add/delete timestamp
 * Optimized status (with a button to optimize now)
 * Button to reload core now (reloads config)

On the "Query" tab for each core:
 * Add a button "Delete docs matching this query"
   (With a JavaScript popup box "are you sure"? :)
 * Add an input box for "query type", setting the "qt" param
 * Add some links below the input boxes, expanding by JavaScript:
   - dismax params
   - spatial params
   - spellcheck params
   - faceting params
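The "Delete docs matching this query" button suggested above would presumably post Solr's delete-by-query XML to the update handler. A sketch of building that request body with Python's standard library (the query itself is illustrative; the UI would then POST it to /solr/<core>/update and follow up with a commit):

```python
from xml.sax.saxutils import escape

def delete_by_query_body(query: str) -> str:
    """Build the XML body Solr's /update handler accepts for delete-by-query."""
    return "<delete><query>%s</query></delete>" % escape(query)

body = delete_by_query_body("group:sports AND createdtime:[* TO 2010-01-01T00:00:00Z]")
print(body)
```

Escaping the query text matters because user input may contain &, <, or > characters that would break the XML.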

Should there also be a tab above all cores, with host-wide stuff?
 * Solr version
 * Host name, port
 * Solr HOME path
 * Zookeeper info and link
 * Core Admin (create new core)

Improve the admin-extra.html concept:
Today, if the file admin-extra.html exists it will be included near
top of current admin GUI. This can be useful, but in this new design, it
perhaps makes more sense to include the admin-extra.html contents in
a widget box on each core. Then each organization can customize and put
links to their internal issue trackers etc..

Include a Dev/Test/Prod indication:
It is common to have three different environments: one for test, one for
development, and one live production. It happens now and then that you do the
wrong action on the wrong server :( so a visual clue as to which environment
you're in is very useful.
I propose a simple solid bar on the very top which is RED for prod, YELLOW
for test and GREEN for dev. Would it be possible to read a Java system property
-Dsolr.environment=dev and based on that set the color of such a top-bar?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 2. mars 2011, at 21.47, Stefan Matheis wrote:

> Hi List,
> 
> given the fact that my Java knowledge is sort of non-existent .. my idea was 
> to rework the Solr Admin Interface.
> 
> Compared to CouchDB's Futon or the MongoDB admin utils .. not that fancy, but 
> it was an idea a few weeks ago - and I would like to contribute something, a thing 
> which has to be non-Java but not useless - hopefully ;)
> 
> Actually it's completely work-in-progress .. but I'm interested in what you 
> guys think. Right direction? Completely wrong, just drop it?
> 
> http://files.mathe.is/solr-admin/01_dashboard.png
> http://files.mathe.is/solr-admin/02_query.png
> http://files.mathe.is/solr-admin/03_schema.png
> http://files.mathe.is/solr-admin/04_analysis.png
> http://files.mathe.is/solr-admin/05_plugins.png
> 
> It's actually using one index.jsp to generate the basic frame, including cores 
> and their navigation. Everything else is loaded via the existing SolrAdminHandler.
> 
> Any Questions, Ideas, Thoughts outta there? Please, let me know :)
> 
> Regards
> Stefan



Re: perfect match in dismax search

2011-03-03 Thread Jan Høydahl
Hi,

I'm working on a Filter which enables boundary matching using the syntax title:"^hello 
I love you$",
which will make sure the match is exact. See SOLR-1980 (no working patch 
yet).

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 3. mars 2011, at 11.07, Markus Jelsma wrote:

> Use either the string fieldType or a field with very little analysis 
> (KeywordTokenizer + LowercaseFilter).
> 
>> How do I obtain a perfect match with a dismax query?
>> 
>> e.g.:
>> 
>> I want to search for "hello i love you" with defType=dismax in the title field,
>> and I want to obtain results whose title is exactly "hello i love you", with
>> all these terms
>> in this order.
>> 
>> Not fewer words or others.
>> How is it possible?
>> 
>> I tried +(hello i love you), but if I have a title which is "hello i
>> love you mum" it matches, and I don't want that!
>> 
>> Thanks



Date range query with mixed inclusive/exclusive

2011-03-03 Thread Tim Terlegård
Is there any chance that
https://issues.apache.org/jira/browse/LUCENE-996 will be backported to
the 3x branch? I see that it's fixed in trunk, but it will be a while
until it's in a release.

How do people generally search for documents from, let's say, the year 2009?
I thought it would be convenient to do something like:
publication:[2009-01-01T00:00:00Z TO 2010-01-01T00:00:00Z}

But there seems to be a bug that prevents this [...} kind of
search. So do people generally search like this?
publication:[2009-01-01T00:00:00Z TO 2009-12-31T23:59:59.999Z]
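Until mixed brackets are supported, the inclusive upper bound can be computed rather than hand-written; a sketch with Python's standard library (this assumes millisecond date resolution, which is what Solr's date fields store):

```python
from datetime import datetime, timedelta

year = 2009
start = datetime(year, 1, 1)
# last representable instant before Jan 1 of the next year, at millisecond precision
end = datetime(year + 1, 1, 1) - timedelta(milliseconds=1)

fmt = "%Y-%m-%dT%H:%M:%S.%f"
range_query = "publication:[%sZ TO %sZ]" % (
    start.strftime(fmt)[:-3],  # trim microseconds down to milliseconds
    end.strftime(fmt)[:-3],
)
print(range_query)
```

This produces a fully inclusive [... TO ...] range equivalent to the half-open one.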

/Tim


Re: Solr TermsComponent: space in term

2011-03-03 Thread Ahmet Arslan
> Is there no way to achieve what the OP
> asked?
> 

TermsComponent operates on indexed terms. One way to achieve multi-word 
suggestions is to use ShingleFilterFactory at index time.
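To illustrate what shingling gives you (a plain-Python imitation for intuition, not the Lucene implementation): ShingleFilter emits word n-grams alongside the original tokens, so multi-word phrases become single indexed terms that TermsComponent can then return.

```python
def shingles(tokens, max_size=2):
    """Imitate a shingle filter: emit the original tokens plus word n-grams."""
    out = list(tokens)
    for size in range(2, max_size + 1):
        for i in range(len(tokens) - size + 1):
            out.append(" ".join(tokens[i:i + size]))
    return out

terms = shingles(["new", "york", "city"])
print(terms)  # ['new', 'york', 'city', 'new york', 'york city']
```

With "new york" indexed as one term, a prefix lookup for "new y" can suggest the full phrase.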




Re: Solr TermsComponent: space in term

2011-03-03 Thread shrinath.m

iorixxx wrote:
> 
> TermsComponent operates on indexed terms. One way to achieve multi-word
> suggestions is to use ShingleFilterFactory at index time.
> 

Thank you @iorixxx.
Could you point me to where I can find good docs on how to do this?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-TermsComponent-space-in-term-tp1898889p2624429.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr TermsComponent: space in term

2011-03-03 Thread Markus Jelsma
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory

On Thursday 03 March 2011 12:15:07 shrinath.m wrote:
> iorixxx wrote:
> > TermsComponent operates on indexed terms. One way to achieve multi-word
> > suggestions is to use ShingleFilterFactory at index time.
> 
> Thank you @iorixxx.
> Could you point me to where I can find good docs on how to do this?
> 
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-TermsComponent-space-in-term-tp189
> 8889p2624429.html Sent from the Solr - User mailing list archive at
> Nabble.com.

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350


adding a document using curl

2011-03-03 Thread Ken Foskey


I have read the various pages and used curl a lot, but I cannot figure out 
the correct command line to add a document to the example Solr instance.


I have tried a few things, however they seem to be for a file on the same 
server as Solr; in my case I am pushing the document from a Windows machine 
to Solr for indexing.


Ta
Ken 



Re: adding a document using curl

2011-03-03 Thread Markus Jelsma
Here's a complete example
http://wiki.apache.org/solr/UpdateXmlMessages#Passing_commit_parameters_as_part_of_the_URL

On Thursday 03 March 2011 12:31:11 Ken Foskey wrote:
> I have read the various pages and used curl a lot, but I cannot figure out
> the correct command line to add a document to the example Solr instance.
> 
> I have tried a few things, however they seem to be for a file on the same
> server as Solr; in my case I am pushing the document from a Windows
> machine to Solr for indexing.
> 
> Ta
> Ken

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350


Re: Solr TermsComponent: space in term

2011-03-03 Thread shrinath.m

Markus Jelsma-2 wrote:
> 
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory
> 
Well, thank you Markus.

Now my schema has the following (the field type XML did not survive the list 
archive):
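The schema XML itself was stripped by the mailing-list archive. Based on the follow-ups in this thread (Ahmet later points out an EdgeNGramFilterFactory in the chain, and the query below facets with facet.prefix), a hypothetical reconstruction of the field type might look roughly like:

```xml
<!-- Hypothetical reconstruction - the original XML did not survive the archive -->
<fieldType name="text_shingle" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="2"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The EdgeNGramFilterFactory after the shingler is what would produce truncated terms like "compliance w" below.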
If I run a query like this:

http://localhost:8983/solr/select?rows=0&q=c&facet=true&facet.field=text&facet.mincount=1&facet.prefix=com

I get output listing a series of facet values, each with a count of 1 (the 
facet value text was stripped by the list archive; it included multi-word 
shingles such as "compliance w").


How do I restrict it to only those words present in the documents, and not
something like "compliance w"?


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-TermsComponent-space-in-term-tp1898889p2624547.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Looking for help with Solr implementation

2011-03-03 Thread Anurag
What is the problem you are facing with Solr? We have done a project
on it, and it would help if you sent the details of what you want to
implement in your project.

-
Kumar Anurag

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Looking-for-help-with-Solr-implementation-tp1886329p2624557.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: adding a document using curl

2011-03-03 Thread Ken Foskey
On Thu, 2011-03-03 at 12:36 +0100, Markus Jelsma wrote:
> Here's a complete example
> http://wiki.apache.org/solr/UpdateXmlMessages#Passing_commit_parameters_as_part_of_the_URL

I should have been clearer: it's a rich-text document. XML I can make work,
and a script is in the example docs folder:

http://wiki.apache.org/solr/ExtractingRequestHandler

I also read the Solr 1.4 book and tried the samples in there; I could not
make them work.

Ta


> On Thursday 03 March 2011 12:31:11 Ken Foskey wrote:
> > I have read the various pages and used curl a lot, but I cannot figure out
> > the correct command line to add a document to the example Solr instance.
> > 
> > I have tried a few things, however they seem to be for a file on the same
> > server as Solr; in my case I am pushing the document from a Windows
> > machine to Solr for indexing.
> > 
> > Ta
> > Ken
> 





Re: adding a document using curl

2011-03-03 Thread pankaj bhatt
Hi All,
   Is there any custom open-source Solr admin application like what
Lucid Imagination provides in its distribution?
   I am trying to create one, however I think it would be reinventing
the wheel.

   Please redirect me if there is any open-source application that can
be used.
   Waiting for your answer.

/ Pankaj Bhatt.


Custom SOLR ADMIN Application

2011-03-03 Thread pankaj bhatt
Hi All,
   Is there any custom open-source Solr admin application like what
Lucid Imagination provides in its distribution?
   I am trying to create one, however I think it would be reinventing
the wheel.

   Please redirect me if there is any open-source application that can
be used.
   Waiting for your answer.

/ Pankaj Bhatt.


Re: AlternateDistributedMLT.patch not working

2011-03-03 Thread Edoardo Tosca
Hi all,
I am currently working on this AlternateDistributedMLT patch.
I've applied it manually on Solr 1.4 and solved some NullPointerException
issues.
It's now working properly.

But I'm not sure about its behaviour, so I'll ask you, list:

I saw that every MLT query for a doc that is in the resultset runs only on
its shard (the one where the doc is in the index).
This means that you can miss documents, probably related to the doc but not
retrieved because they belong to other shards.

Does it make sense?
Is it the expected behaviour?

If it is, I can submit the patch so that at least it works on Solr 1.4.0.

Thanks,

Edo


On Wed, Feb 23, 2011 at 6:53 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

> Hi Isha,
>
> The patch is out of date.  You need to look at the patch and rejection and
> update your local copy of the code to match the logic from the patch, if
> it's
> still applicable to the version of Solr source code you have.
>
> Otis
> 
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> - Original Message 
> > From: Isha Garg 
> > To: solr-user@lucene.apache.org
> > Sent: Tue, February 22, 2011 2:13:23 AM
> > Subject: AlternateDistributedMLT.patch not working
> >
> > Hello,
> >
> >  I tried to use SOLR-788 with solr1.4 so that  distributed MLT works
> well .
> >While working with this patch i got an error mesg  like
> >
> > 1 out of 1 hunk FAILED -- saving rejects to file
> >src/java/org/apache/solr/handler/component/MoreLikeThisComponent.java.rej
> >
> > Can  anybody help me out?
> >
> > Thanks!
> > Isha Garg
> >
> >
>



-- 
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com


Re: adding a document using curl

2011-03-03 Thread Gary Taylor

As an example, I run this in the same directory as the msword1.doc file:

curl "http://localhost:8983/solr/core0/update/extract?literal.docid=74&literal.type=5" -F "file=@msword1.doc"


The "type" literal is just part of my schema.

Gary.


On 03/03/2011 11:45, Ken Foskey wrote:

On Thu, 2011-03-03 at 12:36 +0100, Markus Jelsma wrote:

Here's a complete example
http://wiki.apache.org/solr/UpdateXmlMessages#Passing_commit_parameters_as_part_of_the_URL

I should have been clearer: it's a rich-text document. XML I can make work,
and a script is in the example docs folder:

http://wiki.apache.org/solr/ExtractingRequestHandler

I also read the Solr 1.4 book and tried the samples in there; I could not
make them work.

Ta






Re: Solr Admin Interface, reworked - Go on? Go away?

2011-03-03 Thread mrw

Picture the URI field above the response field, only half-screen.  This
facilitates breaking the query apart on different lines in order to debug
it.  

When you have a lot of shards, fq clauses, etc., you end up with a very long
URI that is difficult to get your head around and manipulate.  We take
queries from the logs, split them around parameters, take the shards out,
put the shards back in, take the OLS labels out, put them back in, etc. 
With long, complex queries, it's essential to have a large work space to
play in. :)




Stefan Matheis wrote:
> 
> mrw,
> 
> you mean a field like here 
> (http://files.mathe.is/solr-admin/02_query.png) on the right side, 
> between meta-navigation and plain solr-xml response?
> 
> actually it's just to display the computed url, but if so .. we could 
> use a larger field for that, of course :)
> 
> Regards
> Stefan
> 
> Am 02.03.2011 22:31, schrieb mrw:
>>
>> Looks nice.
>>
>> Might be also worth it to create a page with large query field for
>> pasting
>> in complete URL-encoded queries that cross cores, etc.  I did that at
>> work
>> (via ASP.net) so we could paste in queries from logs and debug them.  We
>> tend to use that quite a bit.
>>
>>
>> Cheers
>>
>>
>> Stefan Matheis wrote:
>>>
>>> Hi List,
>>>
>>> given the fact that my Java knowledge is sort of non-existent .. my
>>> idea was to rework the Solr Admin Interface.
>>>
>>> Compared to CouchDB's Futon or the MongoDB admin utils .. not that fancy,
>>> but it was an idea a few weeks ago - and I would like to contribute
>>> something, a thing which has to be non-Java but not useless - hopefully
>>> ;)
>>>
>>> Actually it's completely work-in-progress .. but I'm interested in what
>>> you guys think. Right direction? Completely wrong, just drop it?
>>>
>>> http://files.mathe.is/solr-admin/01_dashboard.png
>>> http://files.mathe.is/solr-admin/02_query.png
>>> http://files.mathe.is/solr-admin/03_schema.png
>>> http://files.mathe.is/solr-admin/04_analysis.png
>>> http://files.mathe.is/solr-admin/05_plugins.png
>>>
>>> It's actually using one index.jsp to generate the basic frame, including
>>> cores and their navigation. Everything else is loaded via the existing
>>> SolrAdminHandler.
>>>
>>> Any Questions, Ideas, Thoughts outta there? Please, let me know :)
>>>
>>> Regards
>>> Stefan
>>>
>>
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Solr-Admin-Interface-reworked-Go-on-Go-away-tp2620365p2620745.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Admin-Interface-reworked-Go-on-Go-away-tp2620365p2624956.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Understanding multi-field queries with q and fq

2011-03-03 Thread mrw
Yes, we're investigating dismax (with the qf param), but we're not sure it
supports our syntax needs.  The users want to put AND/OR/NOT in their
queries, and we don't want to write a lot of code converting those queries
into dismax (+/-/mm) format.  So, until 3.1 (edismax) ships, we're also
trying to get boolean queries to work across multiple fields with the
standard query handler.
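With the standard (lucene) query parser, a common workaround is to expand the user's boolean query once per field and OR the copies together. A sketch (field names are illustrative, and it assumes the user input is already valid query syntax):

```python
def expand_across_fields(user_query, fields):
    """Apply one boolean user query to several fields with the lucene parser."""
    per_field = ["%s:(%s)" % (f, user_query) for f in fields]
    return " OR ".join(per_field)

q = expand_across_fields("led AND (zeppelin OR merle)", ["field1", "field2"])
print(q)  # field1:(led AND (zeppelin OR merle)) OR field2:(led AND (zeppelin OR merle))
```

The field:(...) grouping is important; without it, only the first term binds to the field and the rest fall back to the default search field.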

I've seen quite a few unanswered or partially-answered posts on this list on
getting boolean syntax right.  I can tell it's a thorny issue.


Robert Sandiford wrote:
> 
> Have you looked at the 'qf' parameter?
> 
> Bob Sandiford | Lead Software Engineer | SirsiDynix
> P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com
> www.sirsidynix.com 
> _
> http://www.cosugi.org/ 
> 
> 
> 
> 
>> -Original Message-
>> From: mrw [mailto:mikerobertsw...@gmail.com]
>> Sent: Wednesday, March 02, 2011 2:28 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Understanding multi-field queries with q and fq
>> 
>> Anyone understand how to do boolean logic across multiple fields?
>> 
>> Dismax is nice for searching multiple fields, but doesn't necessarily
>> support our syntax requirements. eDismax appears to be not available
>> until
>> Solr 3.1.
>> 
>> In the meantime, it looks like we need to support applying the user's
>> query
>> to multiple fields, so if the user enters "led zeppelin merle" we need
>> to be
>> able to do the logical equivalent of
>> 
>> &fq=field1:led zeppelin merle OR field2:led zeppelin merle
>> 
>> 
>> Any ideas?  :)
>> 
>> 
>> 
>> mrw wrote:
>> >
>> > After searching this list, Google, and looking through the Pugh book,
>> I am
>> > a little confused about the right way to structure a query.
>> >
>> > The Packt book uses the example of the MusicBrainz DB full of song
>> > metadata.  What if they also had the song lyrics in English and
>> German as
>> > files on disk, and wanted to index them along with the metadata, so
>> that
>> > each document would basically have song title, artist, publisher,
>> date,
>> > ..., All_Metadata (copy field of all metadata fields), Text_English,
>> and
>> > Text_German fields?
>> >
>> > There can only be one default field, correct?  So if we want to
>> search for
>> > all songs containing (zeppelin AND (dog OR merle)) do we
>> >
>> > repeat the entire query text for all three major fields in the 'q'
>> clause
>> > (assuming we don't want to use the cache):
>> >
>> > q=(+All_Metadata:zeppelin AND (dog OR merle)+Text_English:zeppelin
>> AND
>> > (dog OR merle)+Text_German:(zeppelin AND (dog OR merle))
>> >
>> > or repeat the entire query text for all three major fields in the
>> 'fq'
>> > clause (assuming we want to use the cache):
>> >
>> > q=*:*&fq=(+All_Metadata:zeppelin AND (dog OR
>> merle)+Text_English:zeppelin
>> > AND (dog OR merle)+Text_German:zeppelin AND (dog OR merle))
>> >
>> > ?
>> >
>> > Thanks!
>> >
>> 
>> 
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Understanding-multi-field-queries-
>> with-q-and-fq-tp2528866p2619700.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Understanding-multi-field-queries-with-q-and-fq-tp2528866p2625068.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr TermsComponent: space in term

2011-03-03 Thread Ahmet Arslan


You need to remove EdgeNGramFilterFactory from your analyzer chain.
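Concretely, the index-time analyzer would then keep the shingles but drop the edge n-grams; a hypothetical corrected chain (the tokenizer choice is an assumption, since the original schema XML was stripped):

```xml
<!-- Hypothetical: shingles preserved, EdgeNGramFilterFactory removed -->
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.ShingleFilterFactory" maxShingleSize="2"/>
</analyzer>
```

Without the edge n-grams, only whole words and whole shingles are indexed, so truncated values like "compliance w" no longer appear.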



--- On Thu, 3/3/11, shrinath.m  wrote:

> From: shrinath.m 
> Subject: Re: Solr TermsComponent: space in term
> To: solr-user@lucene.apache.org
> Date: Thursday, March 3, 2011, 1:41 PM
> 
> Markus Jelsma-2 wrote:
> > 
> > http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory
> > 
> Well, thank you Markus.
> 
> Now my schema has the following (the field type XML did not survive the
> list archive):
> 
> 
> If I run a query like this:
> 
> http://localhost:8983/solr/select?rows=0&q=c&facet=true&facet.field=text&facet.mincount=1&facet.prefix=com
> 
> I get output listing a series of facet values, each with a count of 1 (the
> facet value text was stripped by the list archive; it included multi-word
> shingles such as "compliance w").
> 
> 
> How do I restrict it to only those words present in the
> documents, and not
> something like "compliance w"?
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-TermsComponent-space-in-term-tp1898889p2624547.html
> Sent from the Solr - User mailing list archive at
> Nabble.com.
> 





Re: adding a document using curl

2011-03-03 Thread Gora Mohanty
On Thu, Mar 3, 2011 at 5:15 PM, Ken Foskey  wrote:
> On Thu, 2011-03-03 at 12:36 +0100, Markus Jelsma wrote:
>> Here's a complete example
>> http://wiki.apache.org/solr/UpdateXmlMessages#Passing_commit_parameters_as_part_of_the_URL
>
> I should have been clearer: it's a rich-text document. XML I can make work,
> and a script is in the example docs folder
>
> http://wiki.apache.org/solr/ExtractingRequestHandler
>
> I also read the Solr 1.4 book and tried the samples in there; I could not
> make them work.
[...]

Please provide details on what exactly is not working for you, and the
corresponding error message from the Solr logs. E.g., something like
"I tried posting ABC document to Solr, using XYZ commands", and
include the part from the Solr logs relating to the exception that you
get. After that, further details might be needed, but without the above
it is nigh impossible to guess at what you are trying.

Regards,
Gora


Re: adding a document using curl

2011-03-03 Thread Gora Mohanty
On Thu, Mar 3, 2011 at 5:31 PM, pankaj bhatt  wrote:
> Hi All,
>       Is there any custom open-source Solr admin application like what
> Lucid Imagination provides in its distribution?
>       I am trying to create one, however I think it would be reinventing
> the wheel.
>
>       Please redirect me if there is any open-source application that can
> be used.
>       Waiting for your answer.
[...]

Please do not hijack an existing thread, but start a new one if
you want to discuss a new topic. On Hoss' behalf :-)
http://people.apache.org/~hossman/#threadhijack

Regards,
Gora


Re: adding a document using curl

2011-03-03 Thread Jayendra Patil
If you are using the ExtractingRequestHandler, you can also try using
the stream.file or stream.url.

e.g. curl "http://localhost:8080/solr/core0/update/extract?stream.file=C:/777045.zip&literal.id=777045&literal.title=Test&commit=true"

More detailed explanation at
http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Content-Extraction-Tika

Fields passed with the literal. prefix are stored as normal fields, and the
content extracted from the document is stored in the "text" field by default.

Regards,
Jayendra

On Thu, Mar 3, 2011 at 7:16 AM, Gary Taylor  wrote:
> As an example, I run this in the same directory as the msword1.doc file:
>
> curl "http://localhost:8983/solr/core0/update/extract?literal.docid=74&literal.type=5" -F "file=@msword1.doc"
>
> The "type" literal is just part of my schema.
>
> Gary.
>
>
> On 03/03/2011 11:45, Ken Foskey wrote:
>>
>> On Thu, 2011-03-03 at 12:36 +0100, Markus Jelsma wrote:
>>>
>>> Here's a complete example
>>>
>>> http://wiki.apache.org/solr/UpdateXmlMessages#Passing_commit_parameters_as_part_of_the_URL
>>
>> I should have been clearer.   A rich text document,  XML I can make work
>> and a script is in the example docs folder
>>
>> http://wiki.apache.org/solr/ExtractingRequestHandler
>>
>> I also read the solr 1.4 book and tried samples in there,   could not
>> make them work.
>>
>> Ta
>>
>>
>
>


error in log INFO org.apache.solr.core.SolrCore - webapp=/solr path=/admin/ping params={} status=0 QTime=1

2011-03-03 Thread Mike Franon
I am using Solr under JBoss, so this might be more of a JBoss config
issue, not really sure.  But my logs keep getting spammed, because
Solr sends it as ERROR [STDERR] INFO org.apache.solr.core.SolrCore -
webapp=/solr path=/admin/ping params={} status=0 QTime=1

Has anyone seen this and found a workaround so this isn't logged as an error?

Thanks,
Mike


Content-Type of XMLResponseWriter / QueryResponseWriter

2011-03-03 Thread Bernd Fehling
Dear list,

is there any deeper logic behind the fact that XMLResponseWriter
is sending CONTENT_TYPE_XML_UTF8="application/xml; charset=UTF-8" ?

I would assume (as would most browsers) that XML output
should be received as "text/xml" rather than "application/xml".

Or do you want the browser to open an XML editor with the result?

Best regards, Bernd


deletedPKQuery does not perform with compound PK

2011-03-03 Thread Jérôme Droz

Hello,

I'm using a DIH to import documents from a database. Documents in the 
index represent a relationship between two entities, units and 
dealpoints ("unit has dealpoint"); thus document keys in the index refer 
to a compound SQL key. Full import works fine. In order to optimize the 
import process, I configured both the database and DIH configuration 
file for delta-import.


I added 3 more tables, updated by triggers: a table tracking 
modification time of units, another one tracking modification time of 
dealpoints, and the last one used to track deleted "units having a 
dealpoint".


The uniqueKey field of the schema is defined as follows:

<field name="id" ... required="true" multiValued="false" />
...
<uniqueKey>id</uniqueKey>

Keys are generated by concatenating the unit id and the dealpoint id, 
separated by '-', in the SQL query.


Below is a sample of the data-config.xml I'm using (the original one is 
quite huge and may be confusing):

<dataConfig>
  <document>
    <entity name="unitdealpoints"
        query="select concat_ws('-', cast(u.unit_id as char), cast(dp.deal_point_id as char)) as id, ...
               from unit u, deal_point dp, ... where ..."
        deltaQuery="select us.unit_id as unit_id, dps.deal_point_id as dealpoint_id
               from unit_state us, deal_point_state dps where us.unit_state_last_mod > '${dataimporter.last_index_time}' or dps.deal_point_state_last_mod > '${dataimporter.last_index_time}'"
        deltaImportQuery="select concat_ws('-', cast(u.unit_id as char), cast(dp.deal_point_id as char)) as id, ...
               from unit u, deal_point dp, ... where (u.unit_id = '${dataimporter.delta.unit_id}' or dp.deal_point_id = '${dataimporter.delta.dealpoint_id}') and ..."
        deletedPKQuery="select id from unit_deal_point_delete">
      ...
    </entity>
  </document>
</dataConfig>


I specifically choose to track deleted entities in a dedicated 
(unit_deal_point_delete) table in order to prevent the known (and 
apparently unsolved) bugs described here:

https://issues.apache.org/jira/browse/SOLR-1229?focusedCommentId=12722427&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12722427

The id field in the unit_deal_point_delete table has the exact same 
representation as the document keys. Below is an example of a trigger:


create trigger unit_delete_before before delete on unit
for each row
begin
insert ignore into unit_deal_point_delete (id) select 
concat_ws('-', cast(old.unit_id as char), cast(dpu.deal_point_id as 
char)) from deal_point_unit dpu where dpu.unit_id = old.unit_id;

end;

Delta and delta-import queries work fine, but the deletedPKQuery seems 
to always return 0 rows, although the unit_deal_point_delete table is 
obviously not empty. No errors are written to the logs, but:


Mar 3, 2011 11:23:49 AM org.apache.solr.handler.dataimport.DocBuilder 
collectDelta

INFO: Completed DeletedRowKey for Entity: unitdealpoints rows obtained : 0

I have tested it with versions 1.4.0 & 1.4.1 and the result is the same: 
documents are not deleted.


What is the problem? Am I missing something?

Kind regards
--
Jerome Droz
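
As a side note, the compound-key convention described above is easy to
reproduce client-side when debugging why deletedPKQuery rows don't match
indexed ids. A sketch (Python; mirrors MySQL's concat_ws('-', ...), assuming
no NULL components):

```python
def compound_id(unit_id, deal_point_id, sep="-"):
    # Equivalent of concat_ws('-', cast(unit_id as char), cast(deal_point_id as char)):
    # both components rendered as strings, joined by the separator.
    return sep.join([str(unit_id), str(deal_point_id)])

print(compound_id(42, 7))
```

Any mismatch between this representation and the ids returned by
deletedPKQuery would explain deletes silently doing nothing.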


Why is SolrDispatchFilter using 90% of the Time?

2011-03-03 Thread Stijn Vanhoorelbeke
Hi,

I'm working with a recent NightlyBuild of Solr and I'm doing some serious
ZooKeeper testing.
I've NewRelic monitoring enabled on my solr machines.

When I look at the distribution of the response time, I notice
'SolrDispatchFilter.doFilter()' is taking up 90% of the time.
The other 10% is used by SolrSearcher and the QueryComponent.

+ Can anyone explain to me why SolrDispatchFilter is consuming so much time?
++ Can I do something to lower this number?
 ( After all, SolrDispatchFilter must dispatch each time to the standard
searcher. )

Stijn Vanhoorelbeke


uniqueKey merge documents on commit

2011-03-03 Thread Tim Gilbert
Hi,

 

I have a unique key within my index, but rather than the default
behaviour of overwriting, I am wondering if there is a method to "merge"
the two different documents on commit of the second document.  I have a
testcase which explains what I'd like to happen:

 

@Test
  public void testMerge() throws SolrServerException, IOException
  {
    SolrInputDocument doc1 = new SolrInputDocument();
    doc1.addField("secid", "testid");
    doc1.addField("value1_i", 1);

    SolrAllSec.GetSolrServer().add(doc1);
    SolrAllSec.GetSolrServer().commit();

    SolrInputDocument doc2 = new SolrInputDocument();
    doc2.addField("secid", "testid");
    doc2.addField("value2_i", 2);

    SolrAllSec.GetSolrServer().add(doc2);
    SolrAllSec.GetSolrServer().commit();

    SolrQuery solrQuery = new SolrQuery();
    solrQuery = solrQuery.setQuery("secid:testid");
    QueryResponse response = SolrAllSec.GetSolrServer().query(solrQuery, METHOD.GET);

    List<SolrDocument> result = response.getResults();
    Assert.isTrue(result.size() == 1);
    Assert.isTrue(result.contains("value1"));
    Assert.isTrue(result.contains("value2"));
  }

 

Other than reading "doc1" and adding the fields from "doc2" and
recommitting, is there another way?

 

Thanks in advance,

 

Tim

 



Re: Content-Type of XMLResponseWriter / QueryResponseWriter

2011-03-03 Thread Walter Underwood
Never use text/xml; it overrides any encoding declaration inside the XML file.

http://ln.hixie.ch/?start=1037398795&count=1
http://www.grauw.nl/blog/entry/489

wunder
==
Lead Engineer, MarkLogic

On Mar 3, 2011, at 7:30 AM, Bernd Fehling wrote:

> Dear list,
> 
> is there any deeper logic behind the fact that XMLResponseWriter
> is sending CONTENT_TYPE_XML_UTF8="application/xml; charset=UTF-8" ?
> 
> I would assume (as would most browsers) that XML output
> should be received as "text/xml" rather than "application/xml".
> 
> Or do you want the browser to open an XML editor with the result?
> 
> Best regards, Bernd







Omit hour-min-sec in search?

2011-03-03 Thread bbarani
Hi,

Is there a way to omit hour-min-sec in SOLR date field during search?

I have indexed a field using TrieDateField, and it seems to use UTC
format. The dates get stored as below:

<date name="lastupdateddate">2008-02-26T20:40:30.94Z</date>

I want to do a search based on just YYYY-MM-DD and omit T20:40:30.94Z. Not
sure if it's feasible; just want to check if it's possible.

Also, most of the data in our source doesn't have time information, hence we
are very much interested in just storing the date without time; or, even if
it's stored with some default timestamp, we want to search using just the
date without the timestamp.

Thanks,
Barani



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Omit-hour-min-sec-in-search-tp2625840p2625840.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Admin Interface, reworked - Go on? Go away?

2011-03-03 Thread Stefan Matheis
Hey Jan,

On Thu, Mar 3, 2011 at 11:37 AM, Jan Høydahl  wrote:
> This alone is worthy including, but I also (of course) have some 
> comments/ideas: [...]

Really nice! I'll try to make a list of open todos / missing items and
attach it to the JIRA ticket. Especially for the dismax and
spatial query params, I would need some information (I haven't used them
until now) - but I think these are smaller problems compared to the
complete task :>

Regards
Stefan


Re: SolrJ Tutorial

2011-03-03 Thread Bing Li
Dear Lance,

Could you tell me where I can find the unit tests code?

I appreciate so much for your help!

Best regards,
LB

On Sat, Jan 22, 2011 at 3:58 PM, Lance Norskog  wrote:

> The unit tests are simple and show the steps.
>
> Lance
>
> On Fri, Jan 21, 2011 at 10:41 PM, Bing Li  wrote:
> > Hi, all,
> >
> > In the past, I always used SolrNet to interact with Solr. It works great.
> > Now, I need to use SolrJ. I think it should be easier to do that than
> > SolrNet since Solr and SolrJ should be homogeneous. But I cannot find a
> > tutorial that is easy to follow. No tutorials explain the SolrJ
> programming
> > step by step. No complete samples are found. Could anybody offer me some
> > online resources to learn SolrJ?
> >
> > I also noticed Solr Cell and SolrJ POJO. Do you have detailed resources
> to
> > them?
> >
> > Thanks so much!
> > LB
> >
>
>
>
> --
> Lance Norskog
> goks...@gmail.com
>


Re: Omit hour-min-sec in search?

2011-03-03 Thread Shane Perry
Not sure if there is a means of doing explicitly what you ask, but you
could do a date range:

+mydate:[YYYY-MM-DDT00:00:00Z TO YYYY-MM-DDT23:59:59Z]

On Thu, Mar 3, 2011 at 9:14 AM, bbarani  wrote:
> Hi,
>
> Is there a way to omit hour-min-sec in SOLR date field during search?
>
> I have indexed a field using TrieDateField, and it seems to use UTC
> format. The dates get stored as below:
>
> <date name="lastupdateddate">2008-02-26T20:40:30.94Z</date>
>
> I want to do a search based on just YYYY-MM-DD and omit T20:40:30.94Z. Not
> sure if it's feasible; just want to check if it's possible.
>
> Also, most of the data in our source doesn't have time information, hence we
> are very much interested in just storing the date without time; or, even if
> it's stored with some default timestamp, we want to search using just the
> date without the timestamp.
>
> Thanks,
> Barani
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Omit-hour-min-sec-in-search-tp2625840p2625840.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
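
A helper like the following makes the day-range approach concrete (a Python
sketch; the field name is illustrative, and it assumes the index stores UTC
times):

```python
def day_range(field, day):
    # Constrain a Solr date field to one calendar day (UTC):
    # [<day>T00:00:00Z TO <day>T23:59:59.999Z], with `day` as 'YYYY-MM-DD'.
    return "%s:[%sT00:00:00Z TO %sT23:59:59.999Z]" % (field, day, day)

print(day_range("lastupdateddate", "2008-02-26"))
```

Solr's DateMath rounding (e.g. NOW/DAY) can express the same idea without
spelling out both endpoints, if your version supports it in range queries.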


FilterQuery OR statement

2011-03-03 Thread Tanner Postert
Trying to figure out how I can run something similar to this for the fq
parameter

Field1 in ( 1, 2, 3, 4 )
AND
Field2 in ( 4, 5, 6, 7 )

I found some examples on the net that looked like this: &fq=+field1:(1 2 3
4) +field2(4 5 6 7) but that yields no results.


Re: Dismax, q, q.alt, and defaultSearchField?

2011-03-03 Thread mrw
Thanks, Jan.

It looks like what we need to do is use both q and q.alt, such that q.alt is
always "*:*" and q is either empty for filter-only queries, or has the user
text.  That seems to work.


Jan Høydahl / Cominvent wrote:
> 
> Hi,
> 
> Try
> q.alt={!dismax}banana
> 
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> 
> On 2. mars 2011, at 23.06, mrw wrote:
> 
>> We have two banks of Solr nodes with identical schemas.  The data I'm
>> searching for is in both banks.
>> 
>> One has defaultSearchField set to field1, the other has
>> defaultSearchField
>> set to field2.
>> 
>> We need to support both user queries and facet queries that have no user
>> content.  For the latter, it appears I need to use q.alt=*:*, so I am
>> investigating also using q.alt for user content (e.g., q.alt=banana).
>> 
>> I run the following query:
>> 
>> q.alt=banana
>> &defType=dismax
>> &mm=1
>> &tie=0.1
>> &qf=field1+field2
>> 
>> 
>> On bank one, I get the expected results, but on bank two, I get 0
>> results.
>> 
>> I noticed (via debugQuery=true), that when I use q.alt, it resolves using
>> the defaultSearchField (e.g., field1:banana), not the value of the qf
>> param. 
>> Therefore, I get different results.
>> 
>> If I switched to using q for user queries and q.alt for facet queries, I
>> would still get different results, because q would resolve against the
>> fields in the qf param, and q.alt would resolve against the default
>> search
>> field.
>> 
>> Is there a way to override this behavior in order to get consistent
>> results?
>> 
>> Thanks!
>> 
>> 
>> 
>> 
>> 
>> 
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Dismax-q-q-alt-and-defaultSearchField-tp2621061p2621061.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Dismax-q-q-alt-and-defaultSearchField-tp2621061p2627134.html
Sent from the Solr - User mailing list archive at Nabble.com.
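
The q/q.alt split mrw settles on boils down to a small parameter-selection
rule. A sketch (Python; the field names come from the thread and are
otherwise arbitrary):

```python
def dismax_params(user_text=None):
    # q carries the user text and is resolved against qf by dismax;
    # q.alt=*:* kicks in only when q is absent (filter/facet-only requests).
    params = {"defType": "dismax", "qf": "field1 field2",
              "mm": "1", "tie": "0.1", "q.alt": "*:*"}
    if user_text:
        params["q"] = user_text
    return params

print(sorted(dismax_params("banana").items()))
```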


Re: FilterQuery OR statement

2011-03-03 Thread Ahmet Arslan
> Trying to figure out how I can run
> something similar to this for the fq
> parameter
> 
> Field1 in ( 1, 2, 3 4 )
> AND
> Field2 in ( 4, 5, 6, 7 )
> 
> I found some examples on the net that looked like this:
> &fq=+field1:(1 2 3
> 4) +field2(4 5 6 7) but that yields no results.

May be your default operator is set to AND in schema.xml?
If yes, try using +field2:(4 OR 5 OR 6 OR 7)


  


Re: uniqueKey merge documents on commit

2011-03-03 Thread Jonathan Rochkind

Nope, there is not.

On 3/3/2011 10:55 AM, Tim Gilbert wrote:

Hi,



I have a unique key within my index, but rather than the default
behaviour of overwriting, I am wondering if there is a method to "merge"
the two different documents on commit of the second document.  I have a
testcase which explains what I'd like to happen:



@Test
   public void testMerge() throws SolrServerException, IOException
   {
     SolrInputDocument doc1 = new SolrInputDocument();
     doc1.addField("secid", "testid");
     doc1.addField("value1_i", 1);

     SolrAllSec.GetSolrServer().add(doc1);
     SolrAllSec.GetSolrServer().commit();

     SolrInputDocument doc2 = new SolrInputDocument();
     doc2.addField("secid", "testid");
     doc2.addField("value2_i", 2);

     SolrAllSec.GetSolrServer().add(doc2);
     SolrAllSec.GetSolrServer().commit();

     SolrQuery solrQuery = new  SolrQuery();
     solrQuery = solrQuery.setQuery("secid:testid");
     QueryResponse response = SolrAllSec.GetSolrServer().query(solrQuery, METHOD.GET);

     List<SolrDocument> result = response.getResults();
     Assert.isTrue(result.size() == 1);
     Assert.isTrue(result.contains("value1"));
     Assert.isTrue(result.contains("value2"));
   }



Other than reading "doc1" and adding the fields from "doc2" and
recommitting, is there another way?



Thanks in advance,



Tim
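
The read-merge-reindex workaround Tim mentions is essentially a shallow
field merge done client-side before re-adding the document. A sketch of the
merge step (Python; it assumes every field involved is stored, since only
stored fields can be read back out of the index):

```python
def merge_docs(existing, incoming):
    # Overlay the newly arrived fields on the stored copy; on a field-name
    # collision the incoming value wins. The combined document is then
    # re-added under the same unique key, replacing the old one.
    merged = dict(existing)
    merged.update(incoming)
    return merged

doc1 = {"secid": "testid", "value1_i": 1}
doc2 = {"secid": "testid", "value2_i": 2}
print(merge_docs(doc1, doc2))
```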






Re: FilterQuery OR statement

2011-03-03 Thread Ahmet Arslan

--- On Thu, 3/3/11, Ahmet Arslan  wrote:

> From: Ahmet Arslan 
> Subject: Re: FilterQuery OR statement
> To: solr-user@lucene.apache.org
> Date: Thursday, March 3, 2011, 8:05 PM
> > Trying to figure out how I can
> run
> > something similar to this for the fq
> > parameter
> > 
> > Field1 in ( 1, 2, 3 4 )
> > AND
> > Field2 in ( 4, 5, 6, 7 )
> > 
> > I found some examples on the net that looked like
> this:
> > &fq=+field1:(1 2 3
> > 4) +field2(4 5 6 7) but that yields no results.
> 
> May be your default operator is set to AND in schema.xml?
> If yes, try using +field2(4 OR 5 OR 6 OR 7) 

Actually you can use local params for that.
http://wiki.apache.org/solr/LocalParams

&fq={!q.op=OR df=field1}1 2 3 4&fq={!q.op=OR df=field2}4 5 6 7
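
Since local params use characters that are significant in URLs ({, !, =),
the two fq values need percent-encoding when sent over HTTP. A sketch of
assembling them (Python):

```python
from urllib.parse import urlencode

# Two independent filter queries, each with its own q.op/df local params;
# urlencode over a list of pairs preserves the repeated fq keys.
filters = [("fq", "{!q.op=OR df=field1}1 2 3 4"),
           ("fq", "{!q.op=OR df=field2}4 5 6 7")]
print(urlencode(filters))
```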


  


Re: FilterQuery OR statement

2011-03-03 Thread Tanner Postert
That worked; I thought I tried it before, not sure why it didn't work then.

Also, is there a way to query without a q parameter?

I'm just trying to pull back all of the results where field1:(1 OR 2
OR 3), etc., so I figured I'd use the fq param for caching purposes, because
those queries will likely be run a lot; but if I leave the q parameter off I
get a null pointer error.

On Thu, Mar 3, 2011 at 11:05 AM, Ahmet Arslan  wrote:

> > Trying to figure out how I can run
> > something similar to this for the fq
> > parameter
> >
> > Field1 in ( 1, 2, 3 4 )
> > AND
> > Field2 in ( 4, 5, 6, 7 )
> >
> > I found some examples on the net that looked like this:
> > &fq=+field1:(1 2 3
> > 4) +field2(4 5 6 7) but that yields no results.
>
> May be your default operator is set to AND in schema.xml?
> If yes, try using +field2(4 OR 5 OR 6 OR 7)
>
>
>
>


Location of Main Class in Solr?

2011-03-03 Thread Anurag
I searched the SolrIndexSearcher.java file but there is no main method. I wanted
to know where this class's entry point resides. Can I call this main method (if it
exists) using command-line options in a terminal, rather than through the war
file?

-
Kumar Anurag

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Location-of-Main-Class-in-Solr-tp2627576p2627576.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Admin Interface, reworked - Go on? Go away?

2011-03-03 Thread Stefan Matheis

Am 02.03.2011 23:48, schrieb Robert Muir:

On Wed, Mar 2, 2011 at 5:34 PM, Stefan Matheis
  wrote:

Robert,

even in this WIP-State? if so .. i'll try one tomorrow evening after work



Its totally up to you, sometimes it can be useful to upload a partial
or WIP solution to an issue: as Hoss mentioned its a good way to get
feedback and additional ideas while you work.


There you go :) https://issues.apache.org/jira/browse/SOLR-2399


mixing version of solr

2011-03-03 Thread Ofer Fort
Hey all,
I have a master slave using the same index folder, the master only writes,
and the slave only reads.
Is it possible to use different versions of solr for those two servers?
Let's say I want to gain from the improved search speed of Solr 4.0, but since
it's my production system, I am not willing to index using it, as it's not a
stable release.
Since the slave only reads, if it will crash i'll just restart it.

Can i index using solr 1.4.1 and read the same index with solr 4.0?

thanks


Re: mixing version of solr

2011-03-03 Thread Frederik Kraus
No, that won't work as the index format has changed.
On Donnerstag, 3. März 2011 at 20:03, Ofer Fort wrote: 
> Hey all,
> I have a master slave using the same index folder, the master only writes,
> and the slave only reads.
> Is it possible to use different versions of solr for those two servers?
> Let's say i want to gain from the improved search speed of solr4.0 but since
> it's my production system, am not willing to index using it since it's not a
> stable release.
> Since the slave only reads, if it will crash i'll just restart it.
> 
> Can i index using solr 1.4.1 and read the same index with solr 4.0?
> 
> thanks
> 


Limiting on dates in Solr

2011-03-03 Thread Steve Lewis
I am treating Solr as a NoSQL db that has great search capabilities. I am 
querying on a few fields:

1. text (default)
2. type (my own string field)
3. calibration (my own date field)

I'd like to limit the results to only show one calibration date, using this query:

calibration:[2011-03-03T00:00:00.000Z TO 2011-03-03T59:59:99.999Z]

This mostly works, but a couple of different dates (March 5) seep into the March
3rd results. Is there any way to exclude the other dates, or at least have them
return a lower ranking in the search? I've also tried:

calibration:[2011-03-03T00:00:00.000Z TO 2011-03-03T59:59:99.999Z]  AND NOT ( 
calibration:[* TO 2011-03-03T00:00:00.000Z] OR 
calibration:[2011-03-03T59:59:99.999Z TO *])

Which I found suggested on the stackoverflow web site. I've googled a good bit 
and nothing seems to be jumping out at me. No one else appears to be trying to 
do something similar, so I may just have unrealistic expectations of what a 
search engine will do.

Thanks in advance!
Steve



  

Re: FilterQuery OR statement

2011-03-03 Thread Jonathan Rochkind
You might also consider splitting your two separate "AND" clauses into 
two separate fq's:

&fq=field1:(1 OR 2 OR 3 OR 4)
&fq=field2:(4 OR 5 OR 6 OR 7)

That will cache the two separate clauses separately in the filter cache, 
which is probably preferable in general, without knowing more about your 
use characteristics.


ALSO, instead of either supplying the "OR" explicitly as above, or 
changing the default operator in schema.xml for everything, I believe it 
would work to supply it as a local param:

&fq={!q.op=OR}field1:(1 2 3 4)

If you want to do that.

AND, your question: can you search without a 'q'?  No, but you can 
search with a 'q' that selects all documents, to be limited by the fq's:

q=*:*

On 3/3/2011 1:14 PM, Tanner Postert wrote:

That worked, thought I tried it before, not sure why it didn't before.

Also, is there a way to query without a q parameter?

I'm just trying to pull back all of the field results where field1:(1 OR 2
OR 3) etc. so I figured I'd use the FQ param for caching purposes because
those queries will likely be run a lot, but if I leave the Q parameter off i
get a null pointer error.

On Thu, Mar 3, 2011 at 11:05 AM, Ahmet Arslan  wrote:


Trying to figure out how I can run
something similar to this for the fq
parameter

Field1 in ( 1, 2, 3 4 )
AND
Field2 in ( 4, 5, 6, 7 )

I found some examples on the net that looked like this:
&fq=+field1:(1 2 3
4) +field2(4 5 6 7) but that yields no results.

May be your default operator is set to AND in schema.xml?
If yes, try using +field2(4 OR 5 OR 6 OR 7)






Re: mixing version of solr

2011-03-03 Thread Jonathan Rochkind
In general, no. I think there are index format changes between 1.4.1 and 
4.0.


If the two versions of Solr have the exact same index formats, it would 
theoretically work, but you'd need to figure that out and be sure of it, 
any two arbitrary versions of Solr/lucene may or may not have the exact 
same index formats. _Maybe_ 4.0 can read a 1.4.1 index.  In some cases I 
think it's supposed to be able to. But it all starts getting confusing 
and with edge cases where things don't quite work, I personally wouldn't 
try it.


But personally, I don't like the idea of having two running instances of 
Solr using the exact same on-disk index anyway.  I know people do it, 
you aren't alone, but it makes me nervous; it seems like asking for 
trouble. When the indexing instance writes new indexes, when and how is 
the read-only Solr going to figure that out and load new searchers for 
it?  It just gets confusing and complicated.




On 3/3/2011 2:03 PM, Ofer Fort wrote:

Hey all,
I have a master slave using the same index folder, the master only writes,
and the slave only reads.
Is it possible to use different versions of solr for those two servers?
Let's say i want to gain from the improved search speed of solr4.0 but since
it's my production system, am not willing to index using it since it's not a
stable release.
Since the slave only reads, if it will crash i'll just restart it.

Can i index using solr 1.4.1 and read the same index with solr 4.0?

thanks



Re: mixing version of solr

2011-03-03 Thread Ofer Fort
we've been running like this for almost six months now and it's working OK.
We have a post-commit event on the "master" that executes a commit call on
the "slave", which forces the slave to reload the index.

We started with a "standard" master/slave replication, but a few times the
slave got an OOM and caused 100% CPU on the master itself; restarting both
didn't help, and we had to shut them both down, copy the files from the
master to the slave, and continue the replication.
Since we couldn't resolve this issue we moved to this configuration.

Is anybody here working with Solr 4.0 in production? Feels risky...

On Thu, Mar 3, 2011 at 9:31 PM, Jonathan Rochkind  wrote:

> In general, no. I think there are index format changes between 1.4.1 and
> 4.0.
>
> If the two versions of Solr have the exact same index formats, it would
> theoretically work, but you'd need to figure that out and be sure of it, any
> two arbitrary versions of Solr/lucene may or may not have the exact same
> index formats. _Maybe_ 4.0 can read a 1.4.1 index.  In some cases I think
> it's supposed to be able to. But it all starts getting confusing and with
> edge cases where things don't quite work, I personally wouldn't try it.
>
> But personally, I don't like the idea of having two running instances of
> Solr using the exact same on-disk index anyway.  I know people do it, you
> aren't alone, but it makes me nervous, seems like asking for trouble. When
> the indexing instances writes new indexes, when and how is the read-only
> Solr going to figure that out and load new searchers for it?  It just gets
> confusing and complicated.
>
>
>
>
> On 3/3/2011 2:03 PM, Ofer Fort wrote:
>
>> Hey all,
>> I have a master slave using the same index folder, the master only writes,
>> and the slave only reads.
>> Is it possible to use different versions of solr for those two servers?
>> Let's say i want to gain from the improved search speed of solr4.0 but
>> since
>> it's my production system, am not willing to index using it since it's not
>> a
>> stable release.
>> Since the slave only reads, if it will crash i'll just restart it.
>>
>> Can i index using solr 1.4.1 and read the same index with solr 4.0?
>>
>> thanks
>>
>>


Fwd: [Announce] Now Open: Call for Participation for ApacheCon North America

2011-03-03 Thread Grant Ingersoll


Begin forwarded message:

> From: Sally Khudairi 
> Date: March 3, 2011 3:10:17 PM EST
> To: annou...@apachecon.com
> Subject: [Announce] Now Open: Call for Participation for ApacheCon North 
> America
> Reply-To: s...@apache.org
> 
> Call for Participation 
> ApacheCon North America 2011 
> 7-11 November 2011 
> Westin Bayshore, Vancouver, Canada 
> 
> All submissions must be received by Friday, 29 April 2011 at midnight Pacific 
> Time. 
> 
> ApacheCon, the official conference, trainings, and expo of The Apache 
> Software Foundation (ASF), heads to Vancouver, Canada, this November, with 
> dozens of technical, business, and community-focused sessions for beginner, 
> intermediate, and expert audiences. 
> 
> Now in its 11th year, the ASF develops and shepherds nearly 150 Top-Level 
> Projects and new initiatives in the Apache Incubator and Labs. With hundreds 
> of thousands of applications deploying ASF products and code contributions by 
> more than 2,500 Committers from around the world, the Apache community is 
> recognized as among the most robust, successful, and respected in Open 
> Source. 
> 
> This year's ApacheCon focuses on highly-relevant, professionally-directed 
> presentations that demonstrate specific problems and real-world solutions. We 
> welcome proposals --from developers and users alike-- in the areas of "Apache 
> and ...": 
> 
> ... Enterprise Solutions (from ActiveMQ to Axis2 to ServiceMix, OFBiz to 
> Chemistry, the gang's all here!) 
> 
> ... Cloud Computing (Hadoop, Cassandra, HBase, CouchDB, and friends) 
> 
> ... Emerging Technologies + Innovation (Incubating projects such as Libcloud, 
> Stonehenge, and Wookie) 
> 
> ... Community Leadership (mentoring and meritocracy, GSoC and related 
> initiatives) 
> 
> ... Data Handling, Search + Analytics (Lucene, Solr, Mahout, OODT, Hive and 
> friends) 
> 
> ... Pervasive Computing (Felix/OSGi, Tomcat, MyFaces Trinidad, and friends) 
> 
> ... Servers, Infrastructure + Tools (HTTP Server, SpamAssassin, Geronimo, 
> Sling, Wicket and friends) 
> 
> 
> Submissions are open to anyone with relevant expertise: ASF affiliation is 
> not required to present at, attend, or otherwise participate in ApacheCon. 
> 
> Whilst we encourage submissions that highlight the use of specific Apache 
> solutions, we are unable to accept marketing/commercially-oriented 
> presentations. 
> 
> Other proposals, such as panels, have been considered in the past; you are 
> welcome to submit an alternate presentation, however, such sessions are 
> accepted under exceptional circumstances. Please be as descriptive as 
> possible, including names/bios of proposed panelists and any related details. 
> 
> Accepted speakers (not co-presenters) qualify for general conference 
> admission and a minimum of two nights lodging at the conference hotel. 
> Additional hotel nights and travel assistance are possible, depending on the 
> number of presentations given and type of assistance needed. 
> 
> To submit a presentation proposal, please complete our ONLINE SUBMISSION FORM 
> at http://na11.apachecon.com/proposals/new 
> 
> To be considered, proposals must be received by Friday, 29 April 2011 at 
> midnight Pacific Time. Please email any questions regarding proposal 
> submissions to cfp AT apachecon DOT com. 
> 
> Key Dates:
> 
> 3 March 2011 - CFP Opens 
> 29 April 2011 - CFP Closes 
> 20 May-30 June 2011 - Speaker Notifications and Confirmations 
> 7-11 November 2011 - ApacheCon NA 2011 
> 
> 
> We look forward to seeing you in Vancouver! 
> 
> – The ApacheCon Planning team 
> 
> 
> 
> 
> -
> To unsubscribe, e-mail: announce-unsubscr...@apachecon.com
> For additional commands, e-mail: announce-h...@apachecon.com
> 






Re: Limiting on dates in Solr

2011-03-03 Thread Andreas Kemkes
2011-03-03T59:59:99.999Z - shouldn't that be 2011-03-03T23:59:59.999Z?
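For what it's worth, building the end-of-day timestamp programmatically avoids this class of typo entirely. A minimal Java sketch (the calibration field name is taken from the original query; only java.time is needed):

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class DayRange {

    // Build a Solr date-range clause covering one whole calendar day.
    // The end-of-day instant is 23:59:59.999 -- not 59:59:99.999.
    static String dayRange(String field, LocalDate day) {
        DateTimeFormatter fmt =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");
        String start = day.atStartOfDay().format(fmt);
        String end = day.atTime(23, 59, 59, 999_000_000).format(fmt);
        return field + ":[" + start + " TO " + end + "]";
    }

    public static void main(String[] args) {
        // prints calibration:[2011-03-03T00:00:00.000Z TO 2011-03-03T23:59:59.999Z]
        System.out.println(dayRange("calibration", LocalDate.of(2011, 3, 3)));
    }
}
```

The same start/end pattern works for any calendar day, so the query string never has to be typed by hand.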




From: Steve Lewis 
To: solr-user@lucene.apache.org
Sent: Thu, March 3, 2011 11:21:53 AM
Subject: Limiting on dates in Solr

I am treating Solr as a NoSQL db that has great search capabilities. I am 
querying on a few fields:

1. text (default)
2. type (my own string field)
3. calibration (my own date field)

I'd like to limit the results to only show the calibration using this query:

calibration:[2011-03-03T00:00:00.000Z TO 2011-03-03T59:59:99.999Z]

This mostly works, but a couple of different dates (March 5) seep into the March
3rd results. Is there any way to exclude the other dates, or at least have them
return a lower ranking in the search? I've also tried:

calibration:[2011-03-03T00:00:00.000Z TO 2011-03-03T59:59:99.999Z]  AND NOT ( 
calibration:[* TO 2011-03-03T00:00:00.000Z] OR 
calibration:[2011-03-03T59:59:99.999Z TO *])

Which I found suggested on the stackoverflow web site. I've googled a good bit 
and nothing seems to be jumping out at me. No one else appears to be trying to 
do something similar, so I may just have unrealistic expectations of what a 
search engine will do.

Thanks in advance!
Steve


  

Re: Limiting on dates in Solr

2011-03-03 Thread Steve Lewis
Ugh. Of course. I fixed that a couple weeks ago, something must have crept back 
in!
Thanks a mil!





From: Andreas Kemkes 
To: solr-user@lucene.apache.org
Sent: Thu, March 3, 2011 4:12:02 PM
Subject: Re: Limiting on dates in Solr

2011-03-03T59:59:99.999Z - shouldn't that be 2011-03-03T23:59:59.999Z




From: Steve Lewis 
To: solr-user@lucene.apache.org
Sent: Thu, March 3, 2011 11:21:53 AM
Subject: Limiting on dates in Solr

I am treating Solr as a NoSQL db that has great search capabilities. I am 
querying on a few fields:

1. text (default)
2. type (my own string field)
3. calibration (my own date field)

I'd like to limit the results to only show the calibration using this query:

calibration:[2011-03-03T00:00:00.000Z TO 2011-03-03T59:59:99.999Z]

This mostly works, but a couple of different dates (March 5) seep into the March
3rd results. Is there any way to exclude the other dates, or at least have them
return a lower ranking in the search? I've also tried:

calibration:[2011-03-03T00:00:00.000Z TO 2011-03-03T59:59:99.999Z]  AND NOT ( 
calibration:[* TO 2011-03-03T00:00:00.000Z] OR 
calibration:[2011-03-03T59:59:99.999Z TO *])

Which I found suggested on the stackoverflow web site. I've googled a good bit 
and nothing seems to be jumping out at me. No one else appears to be trying to 
do something similar, so I may just have unrealistic expectations of what a 
search engine will do.

Thanks in advance!
Steve


  

Out of memory while creating indexes

2011-03-03 Thread Solr User
Hi All,

I am trying to create indexes out of a 400MB XML file using the following
command, and I am running into an out-of-memory exception.

$JAVA_HOME/bin/java -Xms768m -Xmx1024m -Durl=http://$SOLR_HOST:$SOLR_PORT/solr/customercarecore/update -jar
$SOLRBASEDIR/dataconvertor/common/lib/post.jar
$SOLRBASEDIR/dataconvertor/customercare/xml/CustomerData.xml

I am planning to bump up the memory and try again.

Did anyone run into a similar issue? Any input would be very helpful in
resolving the out-of-memory exception.

I was able to create indexes with a small file, but not with the large file. I am
not using SolrJ.

Thanks,
Solr User


Max Document Size

2011-03-03 Thread Sean Todd
Is there a maximum document size that Solr can handle?  I'm trying to index
documents greater than 15MB, but every time I do I get a random error.  One
of the other problems with what I'm indexing is that the documents are not in a
human language.  They are EDI documents (EDI is a B2B communication format
similar in structure to iCal-formatted documents) and don't have many
traditional word breaks, but do have segment and element character breaks.  I
tried playing with the maxFieldLength parameter, but that doesn't seem to be
helping (and, yes, I changed it in both places in solrconfig.xml).

Has anyone had any similar problems with Solr?
Sean Todd
Senior Software Developer
EDI Technical Operations
Build.com, Inc.
Smarter Home Improvement™
P.O. Box 7990 Chico, CA 95927
P: 800.375.3403 x534
F: 530.566.1893
st...@build.com | Network of Stores


Re: Out of memory while creating indexes

2011-03-03 Thread Gora Mohanty
On Fri, Mar 4, 2011 at 3:32 AM, Solr User  wrote:
> Hi All,
>
> I am trying to create indexes out of a 400MB XML file using the following
> command and I am running into out of memory exception.

Is this a single record in the XML file? If it is more than one, breaking
it up into separate XML files, say one per record, should help.

> $JAVA_HOME/bin/java -Xms768m -Xmx1024m -Durl=http://$SOLR_HOST:$SOLR_PORT/solr/customercarecore/update -jar
> $SOLRBASEDIR/dataconvertor/common/lib/post.jar
> $SOLRBASEDIR/dataconvertor/customercare/xml/CustomerData.xml
>
> I am planning to bump up the memory and try again.
[...]

If you give Solr enough memory this should work, but IMHO, it would
be better to break up your input XML files if you can.
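To illustrate the splitting idea, here is a rough sketch that batches the <doc> elements of a Solr <add> payload into smaller <add> payloads. It assumes the 400MB file is in Solr's XML update format, and it uses a regex purely for brevity; on a file that size you would stream with an XML parser such as StAX rather than load the whole string into memory:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitAdd {

    // Split the <doc> elements of a Solr <add> payload into batches of at
    // most batchSize docs, each wrapped in its own <add> element, so each
    // batch can be posted to /update separately.
    static List<String> split(String addXml, int batchSize) {
        Matcher m = Pattern.compile("<doc>.*?</doc>", Pattern.DOTALL).matcher(addXml);
        List<String> docs = new ArrayList<>();
        while (m.find()) {
            docs.add(m.group());
        }
        List<String> batches = new ArrayList<>();
        for (int i = 0; i < docs.size(); i += batchSize) {
            StringBuilder sb = new StringBuilder("<add>");
            for (int j = i; j < Math.min(i + batchSize, docs.size()); j++) {
                sb.append(docs.get(j));
            }
            batches.add(sb.append("</add>").toString());
        }
        return batches;
    }

    public static void main(String[] args) {
        String xml = "<add><doc><field name=\"id\">1</field></doc>"
                   + "<doc><field name=\"id\">2</field></doc>"
                   + "<doc><field name=\"id\">3</field></doc></add>";
        // prints 2 (a batch of two docs and a batch of one)
        System.out.println(split(xml, 2).size());
    }
}
```

Each resulting batch is a complete, valid update payload, so an out-of-memory failure in one batch no longer loses the whole run.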

Regards,
Gora


Model foreign key type of search?

2011-03-03 Thread Alex Dong
Hi there,  I need some advice on how to implement this using solr:

We have two tables: urls and bookmarks.
- Each url has four fields:  {guid, title, text, url}
- One url will have one or more bookmarks associated with it. Each bookmark
has these: {link.guid, user, tags, comment}

I'd like to return matched urls based on not only the "title, text" from the
url schema, but also some kind of aggregated popularity score based on all
"bookmarks" for the same url. The popularity score should be based on the
number/frequency of bookmarks that match the query.

For example, a search for "Paris".  Let's say 15 out of 1000 people has
bookmarked a tripadvisor.com page with Paris in tag or comments field;
 another 15 out of 20 people bookmarked
www.ratp.info/orienter/cv/carteparis.php with Paris in it.  I'd like to rank
the later one, ie the metro planner higher.

I am thinking of implementing org.apache.solr.search.ValueSourceParser which
takes a guid and run a "embedded query" to get a score for this guid in the
bookmark schema. This would probably requires two separated indexes to begin
with.

Keen to hear ideas on what's the best way to implement this and where I
should start.

Thanks,
Alex


Re: SolrJ Tutorial

2011-03-03 Thread Grijesh
It comes with every Solr source code download, under

src/test

-
Thanx:
Grijesh
http://lucidimagination.com
--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrJ-Tutorial-tp2307113p2631223.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Model foreign key type of search?

2011-03-03 Thread Gora Mohanty
On Fri, Mar 4, 2011 at 10:24 AM, Alex Dong  wrote:
> Hi there,  I need some advice on how to implement this using solr:
>
> We have two tables: urls and bookmarks.
> - Each url has four fields:  {guid, title, text, url}
> - One url will have one or more bookmarks associated with it. Each bookmark
> has these: {link.guid, user, tags, comment}
>
> I'd like to return matched urls based on not only the "title, text" from the
> url schema, but also some kind of aggregated popularity score based on all
> "bookmarks" for the same url. The popularity score should base on
> number/frequency of bookmarks that match the query.
[...]

It is best not to think of Solr as an RDBMS, and not to try to graft
RDBMS practices onto it. Instead, you should flatten your data,
e.g., in the above, you could have:
* Four single-valued fields: guid, title, text, url
* Four multi-valued fields: bookmark_guid, bookmark_user,
  bookmark_tags, bookmark_comment
Your index would contain one record per guid of the URL,
and you would need to populate the multi-valued bookmark
fields from all bookmark instances associated with that URL.
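As a sketch, a flattened update document for the Paris metro page might look like the following (field names follow the suggestion above, the field values are made up for illustration, and the schema is assumed to declare the bookmark_* fields as multiValued):

```xml
<add>
  <doc>
    <field name="guid">url-42</field>
    <field name="title">Paris metro map</field>
    <field name="text">Interactive map of the Paris metro network</field>
    <field name="url">http://www.ratp.info/orienter/cv/carteparis.php</field>
    <!-- one value per bookmark of this URL -->
    <field name="bookmark_user">alice</field>
    <field name="bookmark_tags">Paris metro</field>
    <field name="bookmark_comment">best map of the Paris underground</field>
    <field name="bookmark_user">bob</field>
    <field name="bookmark_tags">Paris travel</field>
    <field name="bookmark_comment">used this every day in Paris</field>
  </doc>
</add>
```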

Then one could either copy the relevant search fields to a full-text
search field and search only on that, or, e.g., search on bookmark_tags
and bookmark_comment in addition to searching on title and text.

Regards,
Gora


Re: Model foreign key type of search?

2011-03-03 Thread Alex Dong
Gora, thanks for the quick reply.

Yes, I'm aware of the differences between Solr and an RDBMS. We've actually
written a C++ analytical engine that can process a billion tweets
with multi-facet drill-down. We may end up cooking our own in the end, but
so far Solr suits our needs quite well.  The multi-lingual tokenizers and
Tika integration are all too addictive.

What you're suggesting is exactly what I'm doing: using dynamic
fields and copyField to get all the information into one field, then running the
search over that.

However, this is not good enough.  Allow me to elaborate using the same
Paris example again.  Say there are two urls: the first has been bookmarked by
10 people, the second by 100. Suppose the two score roughly the same when we
squeeze everything into one single field. I'd then like to rank the one with
more users higher.

Another way to look at this: PageRank relies on the number and anchor
text of incoming links; we're trying to use the number of people and
their keywords/comments as a weight for the link.
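One way to express that weighting in Solr itself, as a sketch: maintain a per-URL bookmark_count field at index time (a hypothetical field, not part of the schema discussed above) and fold it into the score with dismax's additive boost function:

```text
q=Paris
&defType=dismax
&qf=title+text+bookmark_tags+bookmark_comment
&bf=log(sum(bookmark_count,1))
```

log(sum(bookmark_count,1)) grows slowly, so popularity nudges rather than dominates relevance. The ratio-style weighting in the example (15 of 20 vs. 15 of 1000 bookmarkers) would need a second stored total to divide by, which is more bookkeeping at index time.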

Alex


On Fri, Mar 4, 2011 at 6:29 PM, Gora Mohanty  wrote:

> On Fri, Mar 4, 2011 at 10:24 AM, Alex Dong  wrote:
> > Hi there,  I need some advice on how to implement this using solr:
> >
> > We have two tables: urls and bookmarks.
> > - Each url has four fields:  {guid, title, text, url}
> > - One url will have one or more bookmarks associated with it. Each
> bookmark
> > has these: {link.guid, user, tags, comment}
> >
> > I'd like to return matched urls based on not only the "title, text" from
> the
> > url schema, but also some kind of aggregated popularity score based on
> all
> > "bookmarks" for the same url. The popularity score should base on
> > number/frequency of bookmarks that match the query.
> [...]
>
> It is best not to think of Solr as a RDBMS, and not to try to graft
> RDBMS practices on to it. Instead, you should flatten your data,
> e.g., in the above, you could have:
> * Four single-valued fields: guid, title, text, url
> * Four multi-valued fields: bookmark_guid, bookmark_user,
>  bookmark_tags, bookmark_comment
> Your index would contain one record per guid of the URL,
> and you would need to populate the multi-valued bookmark
> fields from all bookmark instances associated with that URL.
>
> Then one could either copy the relevant search fields to a full-text
> search field, and search only on that, or, e.g., search on bookmark_tags
> and bookmark_comment in addition to searching on title, and text.
>
> Regards,
> Gora
>


Problem using solr 4.0 in java environment

2011-03-03 Thread Isha Garg

Hi,
 I am using the facet.pivot feature of Solr 4.0; it works well and shows
results in the browser. But when I used Solr 4.0 from Java, I got the following error:


Exception in thread "main" java.lang.NoSuchMethodError: 
org.slf4j.spi.LocationAwareLogger.log(Lorg/slf4j/Marker;Ljava/lang/String;ILjava/lang/String;[Ljava/lang/Object;Ljava/lang/Throwable;)V
at 
org.apache.commons.logging.impl.SLF4JLocationAwareLog.trace(SLF4JLocationAwareLog.java:107)
at 
org.apache.commons.httpclient.methods.PostMethod.clearRequestBody(PostMethod.java:152)
at 
org.apache.commons.httpclient.methods.EntityEnclosingMethod.setRequestEntity(EntityEnclosingMethod.java:547)
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:369)
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:245)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)


I have very little knowledge of Solr and Java. Please help me out.


Thanks!
Isha


Re: Content-Type of XMLResponseWriter / QueryResponseWriter

2011-03-03 Thread Bernd Fehling
Hi Walter,

many thanks!

Bernd

Am 03.03.2011 17:01, schrieb Walter Underwood:
> Never use text/xml, that overrides any encoding declaration inside the XML 
> file.
> 
> http://ln.hixie.ch/?start=1037398795&count=1
> http://www.grauw.nl/blog/entry/489
> 
> wunder
> ==
> Lead Engineer, MarkLogic
> 
> On Mar 3, 2011, at 7:30 AM, Bernd Fehling wrote:
> 
>> Dear list,
>>
>> is there any deeper logic behind the fact that XMLResponseWriter
>> is sending CONTENT_TYPE_XML_UTF8="application/xml; charset=UTF-8" ?
>>
>> I would assume (and also most browser) that for XML Output
>> to receive "text/xml" and not "application/xml".
>>
>> Or do you want the browser to call and XML-Editor with the result?
>>
>> Best regards, Bernd


Re: Out of memory while creating indexes

2011-03-03 Thread Upayavira
post.jar is intended for demo purposes, not production use, so it
doesn't surprise me that you've managed to break it.

Have you tried using curl to do the post?

Upayavira

On Thu, 03 Mar 2011 17:02 -0500, "Solr User"  wrote:
> Hi All,
> 
> I am trying to create indexes out of a 400MB XML file using the following
> command and I am running into out of memory exception.
> 
> $JAVA_HOME/bin/java -Xms768m -Xmx1024m -Durl=http://$SOLR_HOST:$SOLR_PORT/solr/customercarecore/update -jar
> $SOLRBASEDIR/dataconvertor/common/lib/post.jar
> $SOLRBASEDIR/dataconvertor/customercare/xml/CustomerData.xml
> 
> I am planning to bump up the memory and try again.
> 
> Did any one ran into similar issue? Any inputs would be very helpful to
> resolve the out of memory exception.
> 
> I was able to create indexes with small file but not with large file. I
> am
> not using Solr J.
> 
> Thanks,
> Solr User
> 
--- 
Enterprise Search Consultant at Sourcesense UK, 
Making Sense of Open Source