Random Display of result in solr

2009-09-10 Thread dharhsana

Hi all, 

I have an issue while working with Solr. 

I am working on a blog module where a user can create multiple blogs, add
posts to them, and have several comments per post. For implementing this
module I am using Solr 1.4. 

When I fetch the blog details of a particular user, the results come back in
a random order. For example, if I pass the blogId in the query, running the
same query twice gives me the results in two different orders:

This is the first result

SolrDocument1{blogTitle=New Blog, blogId=New Blog, userId=1}] 
SolrDocument2{blogId=New Blog, postId=New Post, postTitle=New Post,
postMessage=New Post Message, timestamp_post=Fri Sep 11 09:48:24 IST 2009}] 
SolrDocument3{blogTitle=ammu blog, blogId=ammu blog, userId=1}] 

The Second result 
SolrDocument1{blogTitle=New Blog, blogId=New Blog, userId=1}] 
SolrDocument2{blogTitle=ammu blog, blogId=ammu blog, userId=1}] 
SolrDocument3{blogId=New Blog, postId=New Post, postTitle=New Post,
postMessage=New Post Message, timestamp_post=Fri Sep 11 09:48:24 IST 2009}] 

I am using SolrJ, and when I iterate over the list I sometimes get an
ArrayIndexOutOfBoundsException because of the difference in the result order. 

When I run my code again at some other time, it produces the proper result,
so the list changes all the time. 

If anybody has faced this type of problem, please share your experience. 

I am also not able to get exact matches: if I fetch the blog details of a
particular user by passing the blogTitle, e.g. "rekha blog", Solr does not
return only "rekha blog" but also other blogs that end with "blog" (e.g.
"sandhya blog"). 

What should I do about this? Is there a specific query I should use? I am
using SolrJ; how should I build my query to get exactly the result I want? 
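A sketch of the kind of SolrJ call involved (field names are taken from the
results above; the explicit sort is one way to make the returned order
deterministic; "server" is assumed to be an initialized SolrServer):

SolrQuery query = new SolrQuery("blogId:\"New Blog\"");
// Sorting on a stable field keeps the order the same on every run.
query.addSortField("timestamp_post", SolrQuery.ORDER.asc);
QueryResponse rsp = server.query(query);
for (SolrDocument doc : rsp.getResults()) {
    System.out.println(doc);
}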

Waiting for your reply 

Regards, 

Rekha. 



Re: Backups using Replication

2009-09-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
OK, this was committed on July 15, 2009.

Before that, "backupAfter" was called "snapshot".
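On trunk after that date, the master section of the replication handler looks
along these lines (a sketch; the replicateAfter value is just an example):

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="backupAfter">optimize</str>
  </lst>
</requestHandler>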

On Thu, Sep 10, 2009 at 10:14 PM, wojtekpia  wrote:
>
> I'm using trunk from July 8, 2009. Do you know if it's more recent than that?
>
>
> Noble Paul നോബിള്‍  नोब्ळ्-2 wrote:
>>
>> which version of Solr are you using? the "backupAfter" name was
>> introduced recently
>>
>
>
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


What Tokenizerfactory/TokenFilterFactory can/should I use so a search for "wal mart" matches "walmart"(quotes not included in search or index)?

2009-09-10 Thread Christian Zambrano
There are a lot of company names whose correct spelling people are 
uncertain about. A few examples are:

1. best buy, bestbuy
2. walmart, wal mart, wal-mart
3. Holiday Inn, HolidayInn

What TokenizerFactory and/or TokenFilterFactory should I use so that 
somebody typing "wal mart" (quotes not included) will find "wal mart" and 
"walmart" (again, quotes not included)?


Thanks,

Christian


Re: Default Query Type For Facet Queries

2009-09-10 Thread Lance Norskog
Changing basic defaults like this makes it very confusing to work with
successive solr releases, to read the wiki, etc.

You can make custom search requesthandlers - an example:

<requestHandler name="/custom" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">customparser</str>
  </lst>
</requestHandler>

http://localhost:8983/solr/custom?q=string_in_my_custom_language

On 9/10/09, Stephen Duncan Jr  wrote:
> If using {!type=customparser} is the only way now, should I file an issue to
> make the default configurable?
>
> --
> Stephen Duncan Jr
> www.stephenduncanjr.com
>
> On Thu, Sep 3, 2009 at 11:23 AM, Stephen Duncan Jr  > wrote:
>
> > We have a custom query parser plugin registered as the default for
> > searches, and we'd like to have the same parser used for facet.query.
> >
> > Is there a way to register it as the default for FacetComponent in
> > solrconfig.xml?
> >
> > I know I can add {!type=customparser} to each query as a workaround, but
> > I'd rather register it in the config than make my code send that and strip
> > it off on every facet query.
> >
> > --
> > Stephen Duncan Jr
> > www.stephenduncanjr.com
> >
>



-- 
Lance Norskog
goks...@gmail.com


Re: Extract info from parent node during data import

2009-09-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Fri, Sep 11, 2009 at 6:48 AM, venn hardy  wrote:
>
> Hi Fergus,
>
> When I debugged in the development console 
> http://localhost:9080/solr/admin/dataimport.jsp?handler=/dataimport
>
> I had no problems. Each category/item seems to be only indexed once, and no 
> parent fields are available (except the category name).
>
> I am not entirely sure how the forEach statement works, but my interpretation 
> of forEach="/document/category/item | /document/category" is something like 
> this:
>
> 1. Whenever DIH encounters a document/category it will extract the 
> /document/category/name field as a common field
> 2. Whenever DIH encounters a document/category/item it will extract all of 
> the item fields.
> 3. When all fields have been encountered, save the document in solr and go to 
> the next category/item

/document/category/item | /document/category

means there are two xpaths that trigger a new doc (it is possible to
have more). Whenever DIH encounters the closing tag of one of those
xpaths, it emits all the fields it has collected since the opening of
the same tag; after that, it clears all the fields collected since the
opening of that tag.

Fields it collected before the opening of the same tag are retained.



>
>
>> Date: Thu, 10 Sep 2009 14:19:31 +0100
>> To: solr-user@lucene.apache.org
>> From: fer...@twig.me.uk
>> Subject: RE: Extract info from parent node during data import
>>
>> >Hi Paul,
>> >The forEach="/document/category/item | /document/category/name" didn't work 
>> >(no categoryname was stored or indexed).
>> >However forEach="/document/category/item | /document/category" seems to 
>> >work well. I am not sure why category on its own works, but not 
>> >category/name...
>> >But thanks for tip. It wasn't as painful as I thought it would be.
>> >Venn
>>
>> Hmmm, I had bother with this. Although each occurrence of 
>> /document/category/item
>> causes a new Solr document to be indexed, that document contained all the 
>> fields from
>> the parent element as well.
>>
>> Did you see this?
>>
>> >
>> >> From: noble.p...@corp.aol.com
>> >> Date: Thu, 10 Sep 2009 09:58:21 +0530
>> >> Subject: Re: Extract info from parent node during data import
>> >> To: solr-user@lucene.apache.org
>> >>
>> >> try this
>> >>
>> >> add two xpaths in your forEach
>> >>
>> >> forEach="/document/category/item | /document/category/name"
>> >>
>> >> and add a field as follows
>> >>
>> >> <field column="categoryname" xpath="/document/category/name"
>> >> commonField="true"/>
>> >>
>> >> Please try it out and let me know.
>> >>
>> >> On Thu, Sep 10, 2009 at 7:30 AM, venn hardy  
>> >> wrote:
>> >> >
>> >> > Hello,
>> >> >
>> >> >
>> >> >
>> >> > I am using SOLR 1.4 (from nightly build) and its URLDataSource in 
>> >> > conjunction with the XPathEntityProcessor. I have successfully imported 
>> >> > XML content, but I think I may have found a limitation when it comes to 
>> >> > the commonField attribute in the DataImportHandler.
>> >> >
>> >> >
>> >> >
>> >> > Before writing my own parser to read in a whole XML document, I thought 
>> >> > I'd post the question here (since I got some great advice last time).
>> >> >
>> >> >
>> >> >
>> >> > The bulk of my content is contained within each <item> tag. However, 
>> >> > each item has a parent called <category> and each category has a name 
>> >> > which I would like to import. In my forEach loop I specify the 
>> >> > /document/category/item as the collection of items I am interested in. 
>> >> > Is there any way to extract an element from underneath a parent node? To 
>> >> > be more specific (see the example XML below). I would like to index the 
>> >> > following:
>> >> >
>> >> > - category: Category 1; id: 1; author: Author 1
>> >> >
>> >> > - category: Category 1; id: 2; author: Author 2
>> >> >
>> >> > - category: Category 2; id: 3; author: Author 3
>> >> >
>> >> > - category: Category 2; id: 4; author: Author 4
>> >> >
>> >> >
>> >> >
>> >> > Any ideas on how I can get to a parent node from within a child during 
>> >> > data import? If it can't be done, what do you suggest would be the best 
>> >> > way so I can keep using the DataImportHandler... would XSLT be a good 
>> >> > idea to 'flatten out' the structure a bit?
>> >> >
>> >> >
>> >> >
>> >> > Thanks
>> >> >
>> >> >
>> >> >
>> >> > This is what my XML document looks like:
>> >> >
>> >> > <document>
>> >> >   <category>
>> >> >     <name>Category 1</name>
>> >> >     <item>
>> >> >       <id>1</id>
>> >> >       <author>Author 1</author>
>> >> >     </item>
>> >> >     <item>
>> >> >       <id>2</id>
>> >> >       <author>Author 2</author>
>> >> >     </item>
>> >> >   </category>
>> >> >   <category>
>> >> >     <name>Category 2</name>
>> >> >     <item>
>> >> >       <id>3</id>
>> >> >       <author>Author 3</author>
>> >> >     </item>
>> >> >     <item>
>> >> >       <id>4</id>
>> >> >       <author>Author 4</author>
>> >> >     </item>
>> >> >   </category>
>> >> > </document>
>> >> >
>> >> >
>> >> >
>> >> > And this is what my dataConfig looks like:
>> >> > <dataConfig>
>> >> > <dataSource type="URLDataSource" name="dataSource"/>
>> >> > <document>
>> >> > <entity name="item"
>> >> > url="http://localhost:9080/data/20090817070752.xml"
>> >> > processor="XPathEntityProcessor" forEach="/document/category/item"
>> >> > transformer="DateFormatTransformer" stream="true"
>> >> > dataSource="dataSource">
>> >> > <field column="category" xpath="/document/category/name"
>> >> > commonField="true" />
>> >> > <field column="id" xpath="/document/category/item/id" />
>> >> > <field column="author" xpath="/document/category/item/author" />
>> >> > </entity>
>> >> > </document>
>> >> > </dataConfig>
>> >> >
>> >> >
>> >> >
>> >> 

Re: Facet fields and the DisMax query handler

2009-09-10 Thread Lance Norskog
Facets are not involved here. These are only simple searches.

The DisMax parser does not use field names in the query. DisMax creates a
nice simple syntax for people to type into a web browser search field. The
various parameters let you sculpt the relevance in order to tune the user
experience.

There are ways to intermix dismax parsing in the standard query parser
syntax, but I am no expert. You can also use these field queries as filter
queries; this is a hack but does work. Also, using wildcards interferes with
upper/lower case handling.
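For example, something like this keeps dismax for relevance while restricting
matches with a filter query (a sketch, using the field from the question below):

http://localhost:8983/solr/select?qt=dismax&q=John+Doe&fq=Staff:Doe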

On 9/10/09, Villemos, Gert  wrote:
>
> I'm trying to understand the DisMax query handler. I orginally
> configured it to ensure that the query was mapped onto different fields
> in the documents and a boost assigned if the fields match. And that
> works pretty smoothly.
>
> However when it comes to facetted searches the results perplexes me.
> Consider the following example;
>
> Document A:
>    <Staff>John Doe</Staff>
>
> Document B:
>    <ProjectManager>John Doe</ProjectManager>
>
> The following queries do not return anything:
>Staff:Doe
>Staff:Doe*
>Staff:John
>Staff:John*
>
> The query;
>Staff:"John"
>
> Returns Document A and B, even though document B doesn't even contain the
> field 'Staff' (which is optional)! Through the "qf" parameter dismax has
> been configured to search over the field 'ProjectManager', but I expected
> that using a field-qualified value would exclude the other fields... Looking
> at the scores of the documents, document A does score much higher than
> Document B (a factor of 20), but I would expect not to see B at all. I have
> changed the dismax configuration minimum match to 1, to ensure that all hits
> with a single match are returned, without effect. I have changed the tie
> to 0 with no effect.
>
> What am I missing here? I would like queries such as 'Staff:Doe' to
> return document A, and only A.
>
> Cheers,
> Gert.
>
>
>
>
>


-- 
Lance Norskog
goks...@gmail.com


Re: Re : Indexing fields dynamically

2009-09-10 Thread Lance Norskog
In the schema.xml file, "*_i" is defined as a wildcard type for integer.
If a name-value pair is an integer, use: name_i as the field name.
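In the stock example schema it is declared along these lines (a sketch; the
exact type name varies by version):

<dynamicField name="*_i" type="int" indexed="true" stored="true"/>

so a document field such as <field name="count_i">42</field> (the name here is
just an example) is indexed as an integer.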



On 9/10/09, nourredine khadri  wrote:
>
> Thanks for the quick reply.
>
> OK for dynamicFields, but how can I rename fields during indexing/search
> to add the suffix corresponding to the type?
>
> What is the best way to do this?
>
> Nourredine.
>
>
>
>
> 
> De : Yonik Seeley 
> À : solr-user@lucene.apache.org
> Envoyé le : Jeudi, 10 Septembre 2009, 14h24mn 26s
> Objet : Re: Indexing fields dynamically
>
> On Thu, Sep 10, 2009 at 5:58 AM, nourredine khadri
>  wrote:
> > I want to index my fields dynamically.
> >
> > DynamicFields don't suit my need because I don't know field names in
> advance and field types must be set dynamically too (need strong typing).
>
> This is what dynamic fields are meant for - you pick both the name and
> type (from a pre-defined set of types of course) at runtime.  The
> suffix of the field name matches one of the dynamic fields and
> essentially picks the type.
>
> -Yonik
> http://www.lucidimagination.com
>
>
>
>




-- 
Lance Norskog
goks...@gmail.com


Re: An issue with using Solr Cell and multiple files

2009-09-10 Thread caman

You are right. 
I ran into the same thing. Windows curl gave me an error but cygwin ran
without any issues.

thanks


Lance Norskog-2 wrote:
> 
> It is a windows problem (or curl, whatever).  This works with
> double-quotes.
> 
> C:\Users\work\Downloads>\cygwin\home\work\curl-7.19.4\curl.exe
> http://localhost:8983/solr/update --data-binary "<commit/>" -H
> "Content-type:text/xml; charset=utf-8"
> Single-quotes inside double-quotes should work: "<commit waitFlush='false'/>"
> 
> 
> On Tue, Sep 8, 2009 at 11:59 AM, caman
> wrote:
> 
>>
>> seems to be an error with curl
>>
>>
>>
>>
>> Kevin Miller-17 wrote:
>> >
>> > I am getting the same error message.  I am running Solr on a Windows
>> > machine.  Is the commit command a curl command or is it a Solr command?
>> >
>> >
>> > Kevin Miller
>> > Web Services
>> >
>> > -Original Message-
>> > From: Grant Ingersoll [mailto:gsing...@apache.org]
>> > Sent: Tuesday, September 08, 2009 12:52 PM
>> > To: solr-user@lucene.apache.org
>> > Subject: Re: An issue with  using Solr Cell and multiple files
>> >
>> > solr/examples/exampledocs/post.sh does:
>> > curl $URL --data-binary '<commit/>' -H 'Content-type:text/xml;
>> > charset=utf-8'
>> >
>> > Not sure if that helps or how it compares to the book.
>> >
>> > On Sep 8, 2009, at 1:48 PM, Kevin Miller wrote:
>> >
>> >> I am using the Solr nightly build from 8/11/2009.  I am able to index
>> >> my documents using the Solr Cell but when I attempt to send the commit
>> >
>> >> command I get an error.  I am using the example found in the Solr 1.4
>> >> Enterprise Search Server book (recently released) found on page 84.
>> >> It
>> >> shows to commit the changes as follows (I am showing where my files
>> >> are located not the example in the book):
>> >>
>>  c:\curl\bin\curl http://echo12:8983/solr/update/ -H "Content-Type:
>> >> text/xml" --data-binary '<commit/>'
>> >>
>> >> this gives me this error: The system cannot find the file specified.
>> >>
>> >> I get the same error when I modify it to look like the following:
>> >>
>>  c:\curl\bin\curl http://echo12:8983/solr/update/ '<commit waitFlush="false"/>'
>>  c:\curl\bin\curl "http://echo12:8983/solr/update/" -H "Content-Type:
>> >> text/xml" --data-binary '<commit/>'
>>  c:\curl\bin\curl http://echo12:8983/solr/update/ '<commit/>'
>>  c:\curl\bin\curl "http://echo12:8983/solr/update/" '<commit/>'
>> >>
>> >> I am using the example configuration in Solr so my documents are found
>> >
>> >> in the exampledocs folder also my curl program in located in the root
>> >> directory which is the reason for the way the curl command is being
>> >> executed.
>> >>
>> >> I would appreciate any information on where to look or how to get the
>> >> commit command to execute after indexing multiple files.
>> >>
>> >> Kevin Miller
>> >> Oklahoma Tax Commission
>> >> Web Services
>> >
>> > --
>> > Grant Ingersoll
>> > http://www.lucidimagination.com/
>> >
>> > Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
>> > using Solr/Lucene:
>> > http://www.lucidimagination.com/search
>> >
>> >
>> >
>>
>>
>>
> 
> 
> -- 
> Lance Norskog
> goks...@gmail.com
> 
> 




Re: How to Convert Lucene index files to XML Format

2009-09-10 Thread Lance Norskog
It is best to start off with Solr by playing around with the example in the
example/ directory. Index the data in the example/exampledocs directory, do
some searches, look at the index with the admin/luke page. After that, this
will be much easier.

 To bring your existing Lucene index under Solr, you have to examine the design of the
Lucene index and create a matching Solr schema in solr/conf/schema.xml.
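For example, if the Lucene index holds stored fields "id" and "title" (the
names and types here are only placeholders; use whatever your index actually
contains), the matching schema entries would look something like:

<field name="id" type="string" indexed="true" stored="true"/>
<field name="title" type="text" indexed="true" stored="true"/>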


On 9/10/09, busbus  wrote:
>
>
> Thanks for your reply
>
>
>
>
>
> > On Sep 10, 2009, at 6:41 AM, busbus wrote:
> > Solr defers to Lucene on reading the index.  You just need to tell
> > Solr whether the index is a compound file or not and make sure the
> > versions are compatible.
> >
>
> This part seems to be the point.
> How do I make Solr read Lucene index files?
> There is a tag in solrconfig.xml:
>   <useCompoundFile>false</useCompoundFile>
>
> Setting it to true does not seem to work.
>
> What else needs to be done?
>
> Should I change the config file or add a new tag?
>
> Also, how do I check the compatibility of Lucene and Solr?
>
> Thanks in advance
>
>
>


-- 
Lance Norskog
goks...@gmail.com


Re: Query regarding incremental index replication

2009-09-10 Thread Lance Norskog
There is only one index. The index has newer "segments" which represent new
records and deletes to old records (sort of). Incremental replication copies
new segments; putting the new segments together with the previous index
makes the new index.

Incremental replication under rsync does work; perhaps it did not work for
you.

If you do not want to store the full index on the indexer, that is a
problem. You will not be able to optimize the index on the indexer and ship
the new index to the slaves.

This has more on large-volume Solr installation design:

http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr

On 9/9/09, Silent Surfer  wrote:
>
> Hi ,
>
> Currently we are using Solr 1.3 and we have the following requirement.
>
> As we need to process very high volumes of documents (of the order of 400
> GB per day), we are planning to separate indexer(s) and searcher(s), so that
> there won't be a performance hit.
>
> Our idea is to have a set of servers used only as indexers for index
> creation, and then every 5 mins or so the index will be copied to
> the searchers (a set of Solr servers used only for querying). For this we
> tried to use the snapshooter, rsync, etc.
>
> But the problem with this approach is, the same index is present on both
> the indexer and searcher, and hence occupying large FS.
>
> What we need is a mechanism wherein the indexer contains only the index
> for the past 5 mins (the last indexing cycle before the snapshooter is run)
> and the searcher has the accumulated (total) index, i.e. every 5 mins we
> should be able to move the entire index from indexer to searcher, and so on.
>
> The above scenario is slightly different from a master/slave implementation,
> as on the master we want only the latest (WIP) index and the slave should
> contain the entire index.
>
> Appreciate if anyone can throw some light on how to achieve this.
>
> Thanks,
> sS
>
>
>
>
>


-- 
Lance Norskog
goks...@gmail.com


Re: Very slow first query

2009-09-10 Thread Jonathan Ariel
Yes, but in this case the query that I'm executing doesn't have any facets; I
mean, for this query I'm not using the filter cache at all. What does it mean
that "operating system cache can be significant"? That my first query loads a
big chunk of the index into memory (maybe even the entire index)?

On Thu, Sep 10, 2009 at 10:07 PM, Yonik Seeley
wrote:

> At 12M documents, operating system cache can be significant.
> Also, the first time you sort or facet on a field, a field cache
> instance is populated which can take a lot of time.  You can prevent
> slow first queries by configuring a static warming query in
> solrconfig.xml that includes the common sorts and facets.
>
> -Yonik
> http://www.lucidimagination.com
>
> On Thu, Sep 10, 2009 at 8:55 PM, Jonathan Ariel 
> wrote:
> > Hi!Why would it take for the first query that I execute almost 60 seconds
> to
> > run and after that no more than 50ms? I disabled all my caching to check
> if
> > it is the reason for the subsequent fast responses, but the same happens.
> > I'm using solr 1.3.
> > Something really strange is that it doesn't happen with all the queries.
> It
> > is happening with a query that filters some integer and string fields
> joined
> > by an AND operator. Something like A:1 AND B:2 AND (C:3 AND D:"CA")
> (exact
> > match).
> > My index is around 1200M documents.
> >
> > Thanks,
> >
> > Jonathan
> >
>


Re: Very Urjent

2009-09-10 Thread Lance Norskog
Another, slower way is to create a spell checking dictionary and do spelling
requests on the first few characters the user types.
http://wiki.apache.org/solr/SpellCheckerRequestHandler?highlight=%28spell%29%7C%28checker%29

Another way is to search against facet values with the facet.prefix feature:
http://wiki.apache.org/solr/SimpleFacetParameters?highlight=%28facet%29%7C%28%2A%29#head-021d583a1430f6485c6e929930fceec3e15e1e8a
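For example (a sketch; "title" stands in for whatever field you suggest
against, and "re" for the characters the user has typed so far):

http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=title&facet.prefix=re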

All of these have the same problem: programmers are all perfect spellers,
while normal people are not. None of these techniques assist normal people
to find homonyms.


On 9/9/09, dharhsana  wrote:
>
>
> Hi Shalin Shekhar Mangar,
>
> I got some code from this site:
>
> http://www.mattweber.org/2009/05/02/solr-autosuggest-with-termscomponent-and-jquery/
>
> When I use that code in my project, I find that there is no
> TermsComponent jar or plugin.
>
> Is there any other way of doing autocompletion search without the terms
> component?
>
> If so, please tell me how to implement it.
>
> Waiting for your reply.
>
> Regards,
>
> Rekha.
>
>
>
>
>


-- 
Lance Norskog
goks...@gmail.com


Re: Why dismax isn't the default with 1.4 and why it doesn't support fuzzy search ?

2009-09-10 Thread Lance Norskog
A QueryParser is a Lucene class that parses a string into a tree of query
objects.

A request handler entry in solrconfig.xml describes a Solr RequestHandler
object bound to a name. If a request handler's name is "/abc" then it is
called via
http://localhost:8983/solr/abc but if there is no slash, the name "abc" is
available when some other request handler is called. "Available" means that
some other code can look up the name. In "qt=dismax", the code that
handles &qt knows that dismax is a requesthandler.
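For example (a sketch):

http://localhost:8983/solr/abc?q=foo             (calls the handler named "/abc")
http://localhost:8983/solr/select?qt=abc&q=foo   (uses the handler named "abc")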

(It all made sense when I started typing ...)


On 9/9/09, Villemos, Gert  wrote:
>
> Sorry for being a bit dim, I don't understand this:
>
> Looking at my default configuration for SOLR, I have a request handler
> named 'dismax' and a request handler named 'standard' with default="true".
> I understand that I can configure the usage of this in the query using
> qt=dismax or qt=standard (... or no qt, as standard is set to default). And
> if I set the 'defType=dismax' flag in the standard requesthandler then I
> will use the dismax queryparser per default. This far, so good.
>
> What I don't understand is whether a requesthandler and a queryparser are
> the same thing, i.e. the configuration contains a REQUESTHANDLER with the
> name 'dismax', but does not contain a QUERYPARSER with the name 'dismax'.
> Where does the 'dismax' queryparser come from? Do I have to configure this
> extra? Or is it there per default? Or does it come from the 'dismax'
> requesthandler?
>
> Gert.
>
>
>
>
>
>
> -Original Message-
> From: kaoul@gmail.com [mailto:kaoul@gmail.com] On Behalf Of Erwin
> Sent: Wednesday, September 09, 2009 10:55 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Why dismax isn't the default with 1.4 and why it doesn't
> support fuzzy search ?
>
> Hi Gert,
>
> &qt=dismax in the URL works with Solr 1.3 and 1.4 without further
> configuration. You are right, you should find a "dismax" request handler in
> solrconfig.xml by default.
>
> Erwin
>
> On Wed, Sep 9, 2009 at 7:49 AM, Villemos, Gert
> wrote:
> > On question to this;
> >
> > Do you need to explicitly configure a 'dismax' queryparser in the
> > solrconfig.xml to enable this, or is a queryparser named 'dismax'
> > available per default?
> >
> > Cheers,
> > Gert.
> >
> >
> >
> >
> > -Original Message-
> > From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
> > Sent: Wednesday, September 02, 2009 2:44 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Why dismax isn't the default with 1.4 and why it doesn't
> > support fuzzy search ?
> >
> > : The wiki says "As of Solr 1.3, the DisMaxRequestHandler is simply
> > the
> > : standard request handler with the default query parser set to the
> > : DisMax Query Parser (defType=dismax).". I just made a checkout of
> > svn
> > : and dismax doesn't seems to be the default as :
> >
> > that paragraph doesn't say that dismax is the "default handler" ... it
> > says that using qt=dismax is the same as using qt=standard with the
> > "query parser" set to be the DisMaxQueryParser (using defType=dismax)
> >
> >
> > so doing this replacement on any URL...
> >
> >    qt=dismax   =>  qt=standard&defType=dismax
> >
> > ...should produce identical results.
> >
> > : Secondly, I've patched solr with
> > : http://issues.apache.org/jira/browse/SOLR-629 as I would like to
> > have
> > : fuzzy with dismax. I built it with "ant example". Now, behavior is
> > : still the same, no fuzzy search with dismax (using the qt=dismax
> > : parameter in GET URL).
> >
> > questions/discussion of uncommitted patches is best done in the Jira
> > issue wherey ou found the patch ... that way it helps other people
> > evaluate the patch, and the author of the patch is more likelye to see
> > your feedback.
> >
> >
> > -Hoss
> >
> >
> >
>
>

Re: Date Faceting and Double Counting

2009-09-10 Thread Lance Norskog
datefield:[X TO* Y] for X to Y-0....1

This would be backwards-compatible. {} are used for other things and lexing
is a dying art. Using a * causes mistakes to trigger wildcard syntaxes,
which will fail loudly.

On Tue, Sep 8, 2009 at 5:20 PM, Chris Hostetter wrote:

>
> : I ran into that problem as well but the solution was provided to me by
> : this very list :) See
> : http://www.nabble.com/Range-queries-td24057317.html It's not the
> : cleanest solution, but as long as you know what you're doing it's not
> : that bad.
>
> Hmmm... yeah, that's a total hack.  one of these days we really need to
> fix the lucene query parser grammar so inclusive/exclusive can be
> different for the upper/lower bounds...
>
>datefield:[NOW/DAY TO NOW/DAY+1DAY}
>
>
> -Hoss
>
>


-- 
Lance Norskog
goks...@gmail.com


RE: Extract info from parent node during data import

2009-09-10 Thread venn hardy

Hi Fergus,

When I debugged in the development console 
http://localhost:9080/solr/admin/dataimport.jsp?handler=/dataimport

I had no problems. Each category/item seems to be only indexed once, and no 
parent fields are available (except the category name).

I am not entirely sure how the forEach statement works, but my interpretation 
of forEach="/document/category/item | /document/category" is something like 
this:

1. Whenever DIH encounters a document/category it will extract the 
/document/category/name field as a common field
2. Whenever DIH encounters a document/category/item it will extract all of the 
item fields.
3. When all fields have been encountered, save the document in solr and go to 
the next category/item

 
> Date: Thu, 10 Sep 2009 14:19:31 +0100
> To: solr-user@lucene.apache.org
> From: fer...@twig.me.uk
> Subject: RE: Extract info from parent node during data import
> 
> >Hi Paul,
> >The forEach="/document/category/item | /document/category/name" didn't work 
> >(no categoryname was stored or indexed).
> >However forEach="/document/category/item | /document/category" seems to work 
> >well. I am not sure why category on its own works, but not category/name...
> >But thanks for tip. It wasn't as painful as I thought it would be.
> >Venn
> 
> Hmmm, I had bother with this. Although each occurrence of 
> /document/category/item 
> causes a new Solr document to be indexed, that document contained all the 
> fields from
> the parent element as well.
> 
> Did you see this?
> 
> >
> >> From: noble.p...@corp.aol.com
> >> Date: Thu, 10 Sep 2009 09:58:21 +0530
> >> Subject: Re: Extract info from parent node during data import
> >> To: solr-user@lucene.apache.org
> >> 
> >> try this
> >> 
> >> add two xpaths in your forEach
> >> 
> >> forEach="/document/category/item | /document/category/name"
> >> 
> >> and add a field as follows
> >> 
> >> <field column="categoryname" xpath="/document/category/name"
> >> commonField="true"/>
> >> 
> >> Please try it out and let me know.
> >> 
> >> On Thu, Sep 10, 2009 at 7:30 AM, venn hardy  wrote:
> >> >
> >> > Hello,
> >> >
> >> >
> >> >
> >> > I am using SOLR 1.4 (from nightly build) and its URLDataSource in 
> >> > conjunction with the XPathEntityProcessor. I have successfully imported 
> >> > XML content, but I think I may have found a limitation when it comes to 
> >> > the commonField attribute in the DataImportHandler.
> >> >
> >> >
> >> >
> >> > Before writing my own parser to read in a whole XML document, I thought 
> >> > I'd post the question here (since I got some great advice last time).
> >> >
> >> >
> >> >
> >> > The bulk of my content is contained within each <item> tag. However, 
> >> > each item has a parent called <category> and each category has a name 
> >> > which I would like to import. In my forEach loop I specify the 
> >> > /document/category/item as the collection of items I am interested in. 
> >> > Is there any way to extract an element from underneath a parent node? To 
> >> > be more specific (see the example XML below). I would like to index the 
> >> > following:
> >> >
> >> > - category: Category 1; id: 1; author: Author 1
> >> >
> >> > - category: Category 1; id: 2; author: Author 2
> >> >
> >> > - category: Category 2; id: 3; author: Author 3
> >> >
> >> > - category: Category 2; id: 4; author: Author 4
> >> >
> >> >
> >> >
> >> > Any ideas on how I can get to a parent node from within a child during 
> >> > data import? If it can't be done, what do you suggest would be the best 
> >> > way so I can keep using the DataImportHandler... would XSLT be a good 
> >> > idea to 'flatten out' the structure a bit?
> >> >
> >> >
> >> >
> >> > Thanks
> >> >
> >> >
> >> >
> >> > This is what my XML document looks like:
> >> >
> >> > <document>
> >> >   <category>
> >> >     <name>Category 1</name>
> >> >     <item>
> >> >       <id>1</id>
> >> >       <author>Author 1</author>
> >> >     </item>
> >> >     <item>
> >> >       <id>2</id>
> >> >       <author>Author 2</author>
> >> >     </item>
> >> >   </category>
> >> >   <category>
> >> >     <name>Category 2</name>
> >> >     <item>
> >> >       <id>3</id>
> >> >       <author>Author 3</author>
> >> >     </item>
> >> >     <item>
> >> >       <id>4</id>
> >> >       <author>Author 4</author>
> >> >     </item>
> >> >   </category>
> >> > </document>
> >> >
> >> >
> >> >
> >> > And this is what my dataConfig looks like:
> >> > <dataConfig>
> >> > <dataSource type="URLDataSource" name="dataSource"/>
> >> > <document>
> >> > <entity name="item"
> >> > url="http://localhost:9080/data/20090817070752.xml"
> >> > processor="XPathEntityProcessor" forEach="/document/category/item"
> >> > transformer="DateFormatTransformer" stream="true"
> >> > dataSource="dataSource">
> >> > <field column="category" xpath="/document/category/name" commonField="true" />
> >> > <field column="id" xpath="/document/category/item/id" />
> >> > <field column="author" xpath="/document/category/item/author" />
> >> > </entity>
> >> > </document>
> >> > </dataConfig>
> >> >
> >> >
> >> >
> >> > This is how I have specified my schema
> >> > <field name="id" type="string" indexed="true" stored="true"
> >> > required="true" />
> >> > <field name="category" type="text" indexed="true" stored="true" />
> >> > <field name="author" type="text" indexed="true" stored="true" />
> >> >
> >> > <uniqueKey>id</uniqueKey>
> >> > <defaultSearchField>id</defaultSearchField>
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> 
> >> 
> >> 
> >> -- 
> >> -
> >> Noble Paul | Principal Engineer| AOL | http://aol.com
> >

Re: Dynamically building the value of a field upon indexing

2009-09-10 Thread Lance Norskog
This has to be done by an UpdateRequestProcessor

http://wiki.apache.org/solr/UpdateRequestProcessor
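A rough sketch of such a processor (untested, against the Solr 1.4 APIs; the
source fields "fieldA"/"fieldB" and the aggregate field "id" are placeholders):

import java.io.IOException;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.request.SolrQueryResponse;
import org.apache.solr.update.AddUpdateCommand;
import org.apache.solr.update.processor.UpdateRequestProcessor;
import org.apache.solr.update.processor.UpdateRequestProcessorFactory;

public class ConcatFieldsProcessorFactory extends UpdateRequestProcessorFactory {
  @Override
  public UpdateRequestProcessor getInstance(SolrQueryRequest req,
      SolrQueryResponse rsp, UpdateRequestProcessor next) {
    return new UpdateRequestProcessor(next) {
      @Override
      public void processAdd(AddUpdateCommand cmd) throws IOException {
        SolrInputDocument doc = cmd.getSolrInputDocument();
        // Build the aggregate unique key from the two source fields.
        Object a = doc.getFieldValue("fieldA");
        Object b = doc.getFieldValue("fieldB");
        doc.addField("id", a + ":" + b);
        super.processAdd(cmd); // pass the document down the chain
      }
    };
  }
}

The factory would then be registered in an updateRequestProcessorChain in
solrconfig.xml so it runs on every add.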




On Tue, Sep 8, 2009 at 3:34 PM, Villemos, Gert wrote:

> I would like to build the value of a field based on the value of multiple
> other fields at submission time. I.e. I would like to submit a document such
> as;
>
> foo
> baa
>
> And would like SOLR to store the document as
>
> foo
> baa
> foo:baa
>
> Just to complicate matters I would like the aggregated field to be the
> unique key.
>
> Is this possible?
>
> Thanks,
> Gert.
>
>
>
>


-- 
Lance Norskog
goks...@gmail.com


Re: Very slow first query

2009-09-10 Thread Yonik Seeley
At 12M documents, operating system cache can be significant.
Also, the first time you sort or facet on a field, a field cache
instance is populated which can take a lot of time.  You can prevent
slow first queries by configuring a static warming query in
solrconfig.xml that includes the common sorts and facets.
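For example, something like this in solrconfig.xml (the sort and facet fields
here are placeholders):

<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">*:*</str>
      <str name="sort">price asc</str>
      <str name="facet">true</str>
      <str name="facet.field">category</str>
    </lst>
  </arr>
</listener>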

-Yonik
http://www.lucidimagination.com

On Thu, Sep 10, 2009 at 8:55 PM, Jonathan Ariel  wrote:
> Hi!Why would it take for the first query that I execute almost 60 seconds to
> run and after that no more than 50ms? I disabled all my caching to check if
> it is the reason for the subsequent fast responses, but the same happens.
> I'm using solr 1.3.
> Something really strange is that it doesn't happen with all the queries. It
> is happening with a query that filters some integer and string fields joined
> by an AND operator. Something like A:1 AND B:2 AND (C:3 AND D:"CA") (exact
> match).
> My index is around 1200M documents.
>
> Thanks,
>
> Jonathan
>


Re: An issue with using Solr Cell and multiple files

2009-09-10 Thread Lance Norskog
It is a windows problem (or curl, whatever).  This works with double-quotes.

C:\Users\work\Downloads>\cygwin\home\work\curl-7.19.4\curl.exe
http://localhost:8983/solr/update --data-binary "<commit/>" -H
"Content-type:text/xml; charset=utf-8"
Single-quotes inside double-quotes should work: "<commit waitFlush='false'/>"


On Tue, Sep 8, 2009 at 11:59 AM, caman wrote:

>
> seems to be an error with curl
>
>
>
>
> Kevin Miller-17 wrote:
> >
> > I am getting the same error message.  I am running Solr on a Windows
> > machine.  Is the commit command a curl command or is it a Solr command?
> >
> >
> > Kevin Miller
> > Web Services
> >
> > -Original Message-
> > From: Grant Ingersoll [mailto:gsing...@apache.org]
> > Sent: Tuesday, September 08, 2009 12:52 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: An issue with  using Solr Cell and multiple files
> >
> > solr/examples/exampledocs/post.sh does:
> > curl $URL --data-binary '<commit/>' -H 'Content-type:text/xml;
> > charset=utf-8'
> >
> > Not sure if that helps or how it compares to the book.
> >
> > On Sep 8, 2009, at 1:48 PM, Kevin Miller wrote:
> >
> >> I am using the Solr nightly build from 8/11/2009.  I am able to index
> >> my documents using the Solr Cell but when I attempt to send the commit
> >
> >> command I get an error.  I am using the example found in the Solr 1.4
> >> Enterprise Search Server book (recently released) found on page 84.
> >> It
> >> shows to commit the changes as follows (I am showing where my files
> >> are located not the example in the book):
> >>
>  c:\curl\bin\curl http://echo12:8983/solr/update/ -H "Content-Type:
> >> text/xml" --data-binary '<commit/>'
> >>
> >> this gives me this error: The system cannot find the file specified.
> >>
> >> I get the same error when I modify it to look like the following:
> >>
>  c:\curl\bin\curl http://echo12:8983/solr/update/ '<commit waitFlush="false"/>'
>  c:\curl\bin\curl "http://echo12:8983/solr/update/" -H "Content-Type:
> >> text/xml" --data-binary '<commit/>'
>  c:\curl\bin\curl http://echo12:8983/solr/update/ '<commit/>'
>  c:\curl\bin\curl "http://echo12:8983/solr/update/" '<commit/>'
> >>
> >> I am using the example configuration in Solr so my documents are found
> >
> >> in the exampledocs folder also my curl program in located in the root
> >> directory which is the reason for the way the curl command is being
> >> executed.
> >>
> >> I would appreciate any information on where to look or how to get the
> >> commit command to execute after indexing multiple files.
> >>
> >> Kevin Miller
> >> Oklahoma Tax Commission
> >> Web Services
> >
> > --
> > Grant Ingersoll
> > http://www.lucidimagination.com/
> >
> > Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
> > using Solr/Lucene:
> > http://www.lucidimagination.com/search
> >
> >
> >
>
>
>


-- 
Lance Norskog
goks...@gmail.com


Very slow first query

2009-09-10 Thread Jonathan Ariel
Hi! Why would the first query that I execute take almost 60 seconds to run,
while after that it takes no more than 50ms? I disabled all my caching to check
if it is the reason for the subsequent fast responses, but the same happens.
I'm using solr 1.3.
Something really strange is that it doesn't happen with all the queries. It
is happening with a query that filters some integer and string fields joined
by an AND operator. Something like A:1 AND B:2 AND (C:3 AND D:"CA") (exact
match).
My index is around 1200M documents.

Thanks,

Jonathan


Using EnglishPorterFilterFactory in code

2009-09-10 Thread darniz

Hello,
I have a task where my user gives me 20 English dictionary words, and I
have to run a program and generate a report with all the stemmed words.

I have to use EnglishPorterFilterFactory and SnowballPorterFilterFactory to
check which one is faster and gives the best results.

Should I write a Java module and use the library that comes with Solr?
Is there any code snippet I can use?

Is there any utility that Solr provides?

My faint idea of how to do it is to create an EnglishPorterFilter from
EnglishPorterFilterFactory by passing a tokenizer, etc.

I would appreciate it if someone could give me a hint on this.
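Something along these lines is what I imagine (an untested sketch against the
Solr 1.4 / Lucene 2.9 APIs):

import java.io.StringReader;
import java.util.HashMap;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.WhitespaceTokenizer;
import org.apache.lucene.analysis.tokenattributes.TermAttribute;
import org.apache.solr.analysis.EnglishPorterFilterFactory;

public class StemReport {
  public static void main(String[] args) throws Exception {
    EnglishPorterFilterFactory factory = new EnglishPorterFilterFactory();
    factory.init(new HashMap<String, String>()); // no protected words
    // Tokenize the input words, then run them through the stemming filter.
    TokenStream ts = factory.create(
        new WhitespaceTokenizer(new StringReader("machines running easily")));
    TermAttribute term = ts.addAttribute(TermAttribute.class);
    while (ts.incrementToken()) {
      System.out.println(term.term()); // print each stemmed word
    }
  }
}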

thanks
darniz




Re: SnowballPorterFilterFactory stemming word question

2009-09-10 Thread darniz

Thanks Yonik.
I have a task where my user gives me 20 English dictionary words, and I
have to run a program and generate a report with all the stemmed words.

I have to use EnglishPorterFilterFactory and SnowballPorterFilterFactory to
check which one is faster and gives the best results.

Should I write a Java module and use the library that comes with Solr?
Is there any code snippet I can use?

My faint idea of how to do it is to create an EnglishPorterFilter from
EnglishPorterFilterFactory by passing a tokenizer, etc.

I would appreciate it if someone could give me a hint on this.

thanks
darniz









Yonik Seeley-2 wrote:
> 
> On Mon, Sep 7, 2009 at 2:49 AM, darniz wrote:
>> Does solr provide any implementation for dictionary stemmer, please let
>> me
>> know
> 
> The Krovetz stemmer is dictionary based (english only):
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/Kstem
> 
> But from your original question, maybe you are concerned when the
> stemmer doesn't return real words? For normal search, don't be.
> During index time, words are stemmed, and then later the query is
> stemmed.  If the results match up, you're good.  For example, a
> document containing the word "machines" may stem to "machin" and then
> a query of "machined" will stem to "machin" and thus match the
> document.
> 
> 
> -Yonik
> http://www.lucidimagination.com
> 
> 




Re: Query runs faster without filter queries?

2009-09-10 Thread Jonathan Ariel
Thanks! I don't think I can use an unreleased version of Solr even if it's
stable enough (crazy infrastructure guys), but I might be able to apply the 2
patches mentioned in the link you sent. I will try it in my local copy of
Solr, see if it improves, and let you know.
Thanks!

On Thu, Sep 10, 2009 at 5:43 PM, Uri Boness  wrote:

> If I recall correctly, in Solr 1.3 there was an issue where filters didn't
> really behave as they should have. Basically, if you had a query and
> filters defined, the query would be executed normally and only after that
> would the filters be applied. AFAIK this is fixed in 1.4, where the
> documents excluded by the filters are now skipped during the query
> execution.
>
> Uri
>
>
> Jonathan Ariel wrote:
>
>> Hi all!
>> I'm trying to measure the query response time when using just a query and
>> when using some filter queries. From what I read and understand adding
>> filter query should boost the query response time. I used luke to
>> understand
>> over which fields I should use filter query (those that have few unique
>> terms, in my case 2 fields of 30 and 400 unique fields). I'm using solr
>> 1.3.
>> In order to test the query performance I disabled queryCache and
>> documentCache, so I just have filterCache enabled.I did that because I
>> wanted to be sure that there is no caching when I measure my queries. I
>> left
>> filterCache because it makes sense since filter query uses that.
>>
>> When I first execute my query without filter cache it runs in 400ms, next
>> execution of the same query around 20ms.
>> When I first execute my query with filter cache it runs in 500ms, next
>> execution of the same query around 50ms.
>>
>> Why the query with filter query runs slower than the query without filter
>> query? Shouldn't it be the other way around?
>>
>> My index is around 12M documents. My filterCache max size is set to 4
>> (I
>> think more than enough). The fields that I use as filter queries are
>> integer
>> and in my query I search over a tokenized text field.
>>
>> What do you think?
>>
>> Thanks a lot,
>>
>> Jonathan
>>
>>
>>
>


Re: Single Core or Multiple Core?

2009-09-10 Thread Jonathan Ariel
Yes, it seems like I don't need to split. I could use different commit
times. In my use case the commits are too frequent, and I could use a
different commit time on a country basis. Your questions made me rethink the
need to split into cores.

Thanks

On Fri, Sep 4, 2009 at 5:38 AM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:

> On Fri, Sep 4, 2009 at 4:35 AM, Jonathan Ariel  wrote:
>
> > It seems like it is really hard to decide when the Multiple Core solution
> > is
> > more appropriate. As I could understand from this list and wiki, the
> Multiple
> > Core feature was designed to address the need of handling different sets
> of
> > data within the same solr instance, where the sets of data don't need to
> be
> > joined.
> >
>
> Correct. It is also useful when you don't want to setup multiple boxes or
> tomcats for each Solr.
>
>
> > In my case the documents are of a specific site and country. So document
> A
> > can be of Site 1 / Country 1, B of Site 2 / Country 1, C of Site 1 /
> > Country
> > 2, and so on.
> > For the use cases of my application I will never query across countries
> or
> > sites. I will always have to provide to the query the country id and the
> > site id.
> > Would you suggest to split my data into cores? I have few sites (around
> 20)
> > and more countries (around 90).
> > Should I split my data into sites (around 20 cores) and within a core
> > filter
> > by site? Should I split by Site and Country (around 1800 cores)?
> > What should I consider when splitting my data into multiple cores?
> >
> >
> The first question is why do you want to split at all? Is the schema or
> solrconfig different? Are the different sites or countries updated at
> different times? Is the combined index very big that the response times
> jump
> wildly when all the caches are thrown out if documents related to one site
> or country are updated? Does warmup or optimize or replication take too
> much
> time with one big index?
>
> Each core will have its own configuration files (maintenance) and you need
> to setup replication separately for each core (which is a pain with the
> script based replication). Also note that by keeping all cores in one
> tomcat
> (one JVM), a stop-the-world GC will stop all cores which is not the case
> when using separate JVMs for each index/core.
>
> --
> Regards,
> Shalin Shekhar Mangar.
>


Re: Highlighting in SolrJ?

2009-09-10 Thread Paul Tomblin
If I set hl.snippets to 9 and "hl.mergeContiguous" to true, will I get
the entire contents of the field with all the search terms highlighted?
I don't see what good it would be just getting one line out of the
whole field as a snippet.
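In other words, something like this (a sketch; the parameter values are just
examples):

SolrQuery query = new SolrQuery("foo");
query.setHighlight(true);
query.setParam("hl.fl", "content");
query.setParam("hl.snippets", "9999");        // large enough for every fragment
query.setParam("hl.mergeContiguous", "true"); // merge adjacent fragments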

On Thu, Sep 10, 2009 at 7:45 PM, Jay Hill  wrote:
> Set up the query like this to highlight a field named "content":
>
>    SolrQuery query = new SolrQuery();
>    query.setQuery("foo");
>
>    query.setHighlight(true).setHighlightSnippets(1); //set other params as
> needed
>    query.setParam("hl.fl", "content");
>
>    QueryResponse queryResponse =getSolrServer().query(query);
>
> Then to get back the highlight results you need something like this:
>
>    Iterator<SolrDocument> iter = queryResponse.getResults().iterator();
>
>    while (iter.hasNext()) {
>      SolrDocument resultDoc = iter.next();
>
>      String content = (String) resultDoc.getFieldValue("content");
>      String id = (String) resultDoc.getFieldValue("id"); //id is the
> uniqueKey field
>
>      if (queryResponse.getHighlighting().get(id) != null) {
>        List<String> highlightSnippets =
> queryResponse.getHighlighting().get(id).get("content");
>      }
>    }
>
> Hope that gets you what you need.
>
> -Jay
> http://www.lucidimagination.com
>
> On Thu, Sep 10, 2009 at 3:19 PM, Paul Tomblin  wrote:
>
>> Can somebody point me to some sample code for using highlighting in
>> SolrJ?  I understand the highlighted versions of the field comes in a
>> separate NamedList?  How does that work?
>>
>> --
>> http://www.linkedin.com/in/paultomblin
>>
>



-- 
http://www.linkedin.com/in/paultomblin


Re: Highlighting in SolrJ?

2009-09-10 Thread Jay Hill
Set up the query like this to highlight a field named "content":

SolrQuery query = new SolrQuery();
query.setQuery("foo");

query.setHighlight(true).setHighlightSnippets(1); //set other params as
needed
query.setParam("hl.fl", "content");

QueryResponse queryResponse =getSolrServer().query(query);

Then to get back the highlight results you need something like this:

Iterator<SolrDocument> iter = queryResponse.getResults().iterator();

while (iter.hasNext()) {
  SolrDocument resultDoc = iter.next();

  String content = (String) resultDoc.getFieldValue("content");
  String id = (String) resultDoc.getFieldValue("id"); //id is the
uniqueKey field

  if (queryResponse.getHighlighting().get(id) != null) {
    List<String> highlightSnippets =
queryResponse.getHighlighting().get(id).get("content");
  }
}

Hope that gets you what you need.

-Jay
http://www.lucidimagination.com

On Thu, Sep 10, 2009 at 3:19 PM, Paul Tomblin  wrote:

> Can somebody point me to some sample code for using highlighting in
> SolrJ?  I understand the highlighted versions of the field comes in a
> separate NamedList?  How does that work?
>
> --
> http://www.linkedin.com/in/paultomblin
>


Re: about SOLR-1395 integration with katta

2009-09-10 Thread Jason Rutherglen
Hi Zhong,

For #2 the existing patch SOLR-1395 is a good start.  It should be
fairly simple to deploy indexes and distribute them to Solr Katta
nodes/servers.

-J

On Wed, Sep 9, 2009 at 11:41 PM, Zhenyu Zhong  wrote:
> Jason,
>
> Thanks for the reply.
>
> In general, I would like to use katta to handle the management overhead such
> as single point of failure as well as the distributed index deployment. At
> the same time, I still want to use the nice search features provided by Solr.
>
> Basically, I would like to try both approaches on the indexing part:
> 1. Using Hadoop to launch MR jobs to build the index, then deploy the index
> to katta.
> 2. Using the new patch SOLR-1395.
>    Based on my understanding, it seems to support index building with
> Hadoop. I assume the index would have all the necessary information, such as
> the Solr index schema, so that I can still use the nice search features
> provided by Solr.
>
> On the search part,
> I would like to try distributed search on a Solr index deployed on katta,
> if that is possible.
>
> I would be very appreciated if you could share some thoughts with me.
>
> thanks
> zhong
>
>
>
> On Wed, Sep 9, 2009 at 6:06 PM, Jason Rutherglen > wrote:
>
>> Hi Zhong,
>>
>> It's a very new patch. I'll update the issue as we start the
>> wiki page.
>>
>> I've been working on indexing in Hadoop in conjunction with
>> Katta, which is different (it sounds) than your use case where
>> you have prebuilt indexes you simply want to distributed using
>> Katta?
>>
>> -J
>>
>> On Wed, Sep 9, 2009 at 12:33 PM, Zhenyu Zhong 
>> wrote:
>> > Hi,
>> >
>> > It is really exciting to see this integration coming out.
>> > May I ask how I need to make changes to be able to deploy Solr index on
>> > katta servers?
>> > Are there any tutorials?
>> >
>> > thanks
>> > zhong
>> >
>>
>


Highlighting in SolrJ?

2009-09-10 Thread Paul Tomblin
Can somebody point me to some sample code for using highlighting in
SolrJ?  I understand the highlighted versions of the field come in a
separate NamedList?  How does that work?

-- 
http://www.linkedin.com/in/paultomblin


shards and facet_count

2009-09-10 Thread Paul Rosen

Hi again,

I've mostly gotten the multicore working except for one detail.

(I'm using solr 1.3 and solr-ruby 0.0.6 in a rails project.)

I've done a few queries and I appear to be able to get hits from either 
core. (yeah!)


I'm forming my request like this:

req = Solr::Request::Standard.new(
  :start => start,
  :rows => max,
  :sort => sort_param,
  :query => query,
  :filter_queries => filter_queries,
  :field_list => @field_list,
  :facets => {:fields => @facet_fields, :mincount => 1, :missing => 
true, :limit => -1},

  :highlighting => {:field_list => ['text'], :fragment_size => 600},
  :shards => @cores)

If I leave ":shards => @cores" out, then the response includes:

'facet_counts' => {
  'facet_dates' => {},
  'facet_queries' => {},
  'facet_fields' => { 'myfacet' => [ etc...], etc... }

which is what I expect.

If I add the ":shards => @cores" back in (so that I'm doing the exact 
request above), I get:


'facet_counts' => {
  'facet_dates' => {},
  'facet_queries' => {},
  'facet_fields' => {}

so I've lost my facet information.

Why would it correctly find my documents, but not report the facet info?

Thanks,
Paul


Re: Query runs faster without filter queries?

2009-09-10 Thread Uri Boness
If I recall correctly, in solr 1.3 there was an issue where filters 
didn't really behaved as they should have. Basically, if you had a query 
and filters defined, the query would have executed normally and only 
after that the filter would be applied. AFAIK this is fixed in 1.4 where 
now the documents which are defined by the filters are skipped during 
the query execution.


Uri

Jonathan Ariel wrote:

Hi all!
I'm trying to measure the query response time when using just a query and
when using some filter queries. From what I read and understand adding
filter query should boost the query response time. I used luke to understand
over which fields I should use filter query (those that have few unique
terms, in my case 2 fields of 30 and 400 unique fields). I'm using solr 1.3.
In order to test the query performance I disabled queryCache and
documentCache, so I just have filterCache enabled.I did that because I
wanted to be sure that there is no caching when I measure my queries. I left
filterCache because it makes sense since filter query uses that.

When I first execute my query without filter cache it runs in 400ms, next
execution of the same query around 20ms.
When I first execute my query with filter cache it runs in 500ms, next
execution of the same query around 50ms.

Why the query with filter query runs slower than the query without filter
query? Shouldn't it be the other way around?

My index is around 12M documents. My filterCache max size is set to 4 (I
think more than enough). The fields that I use as filter queries are integer
and in my query I search over a tokenized text field.

What do you think?

Thanks a lot,

Jonathan

  


Re: Query runs faster without filter queries?

2009-09-10 Thread Yonik Seeley
Try 1.4
http://www.lucidimagination.com/blog/2009/05/27/filtered-query-performance-increases-for-solr-14/

-Yonik
http://www.lucidimagination.com



On Thu, Sep 10, 2009 at 4:35 PM, Jonathan Ariel  wrote:
> Hi all!
> I'm trying to measure the query response time when using just a query and
> when using some filter queries. From what I read and understand, adding a
> filter query should boost the query response time. I used Luke to figure out
> which fields I should use as filter queries (those that have few unique
> terms; in my case two fields with 30 and 400 unique terms). I'm using Solr 1.3.
> In order to test the query performance I disabled queryCache and
> documentCache, so I just have filterCache enabled. I did that because I
> wanted to be sure that there is no caching when I measure my queries. I left
> filterCache because it makes sense, since filter queries use it.
>
> When I first execute my query without filter queries it runs in 400ms; the
> next execution of the same query takes around 20ms.
> When I first execute my query with filter queries it runs in 500ms; the next
> execution of the same query takes around 50ms.
>
> Why does the query with filter queries run slower than the query without
> them? Shouldn't it be the other way around?
>
> My index is around 12M documents. My filterCache max size is set to 4 (I
> think more than enough). The fields that I use as filter queries are integer
> fields, and in my query I search over a tokenized text field.
>
> What do you think?
>
> Thanks a lot,
>
> Jonathan
>


Query runs faster without filter queries?

2009-09-10 Thread Jonathan Ariel
Hi all!
I'm trying to measure the query response time when using just a query and
when using some filter queries. From what I read and understand, adding a
filter query should boost the query response time. I used Luke to figure out
which fields I should use as filter queries (those that have few unique
terms; in my case two fields with 30 and 400 unique terms). I'm using Solr 1.3.
In order to test the query performance I disabled queryCache and
documentCache, so I just have filterCache enabled. I did that because I
wanted to be sure that there is no caching when I measure my queries. I left
filterCache because it makes sense, since filter queries use it.

When I first execute my query without filter queries it runs in 400ms; the
next execution of the same query takes around 20ms.
When I first execute my query with filter queries it runs in 500ms; the next
execution of the same query takes around 50ms.

Why does the query with filter queries run slower than the query without
them? Shouldn't it be the other way around?

My index is around 12M documents. My filterCache max size is set to 4 (I
think more than enough). The fields that I use as filter queries are integer
fields, and in my query I search over a tokenized text field.

What do you think?

Thanks a lot,

Jonathan


Re: Solr http post performance seems slow - help?

2009-09-10 Thread Dan A. Dickey
On Thursday 10 September 2009 01:47:38 pm Walter Underwood wrote:
> What kind of storage is used for the Solr index files? When I tested it, NFS
> was 100X slower than local disk.

I'm sorry - I misunderstood your question.  The Solr indexes themselves are
stored on local disk.  The documents are retrievable (for DIH) from NFS.

And I started looking closer into this problem... both the box doing the
posts and the Solr box are around 90% idle while the indexing process is
running, and there is no I/O wait time.
I'm now looking into possible network slowness...
-Dan

> 
> wunder 
> 
> -Original Message-
> From: Dan A. Dickey [mailto:dan.dic...@savvis.net] 
> Sent: Thursday, September 10, 2009 11:15 AM
> To: solr-user@lucene.apache.org
> Cc: Walter Underwood
> Subject: Re: Solr http post performance seems slow - help?
> 
> On Thursday 10 September 2009 09:10:27 am Walter Underwood wrote:
> > How big are your documents?
> 
> For the most part, I'm just indexing metadata that has been pulled from
> the documents.  I think I have currently about 40 or so fields that I'm
> setting.
> When the document is an actual document - pdf, doc, etc... I use the DIH
> to extract stuff and also set the metadata then.
> 
> > Is your index on local disk or network- 
> > mounted disk?
> 
> I'm basically pulling the metadata info from a database and the documents
> themselves are shared via NFS to the Solr indexer.
>   -Dan
> 
> > 
> > wunder
> > 
> > On Sep 10, 2009, at 6:39 AM, Yonik Seeley wrote:
> > 
> > > On Thu, Sep 10, 2009 at 9:13 AM, Dan A. Dickey  
> > >  wrote:
> > >> I'm posting documents to Solr using http (curl) from
> > >> C++/C code and am seeing approximately 3.3 - 3.4
> > >> documents per second being posted.  Is this to be expected?
> > >
> > > No, that's very slow.
> > > Are you using libcurl, or actually forking a new process for every  
> > > document?
> > > Are you committing on every document?
> > >
> > > If you can, using Java would make your life much easier since you
> > > could use the SolrJ client and its binary protocol for indexing.
> > >
> > > -Yonik
> > > http://www.lucidimagination.com
> > >
> > 
> > 
> 
> 

-- 
Dan A. Dickey | Senior Software Engineer

Savvis
10900 Hampshire Ave. S., Bloomington, MN  55438
Office: 952.852.4803 | Fax: 952.852.4951
E-mail: dan.dic...@savvis.net


RE: Solr http post performance seems slow - help?

2009-09-10 Thread Walter Underwood
What kind of storage is used for the Solr index files? When I tested it, NFS
was 100X slower than local disk.

wunder 

-Original Message-
From: Dan A. Dickey [mailto:dan.dic...@savvis.net] 
Sent: Thursday, September 10, 2009 11:15 AM
To: solr-user@lucene.apache.org
Cc: Walter Underwood
Subject: Re: Solr http post performance seems slow - help?

On Thursday 10 September 2009 09:10:27 am Walter Underwood wrote:
> How big are your documents?

For the most part, I'm just indexing metadata that has been pulled from
the documents.  I think I have currently about 40 or so fields that I'm
setting.
When the document is an actual document - pdf, doc, etc... I use the DIH
to extract stuff and also set the metadata then.

> Is your index on local disk or network- 
> mounted disk?

I'm basically pulling the metadata info from a database and the documents
themselves are shared via NFS to the Solr indexer.
-Dan

> 
> wunder
> 
> On Sep 10, 2009, at 6:39 AM, Yonik Seeley wrote:
> 
> > On Thu, Sep 10, 2009 at 9:13 AM, Dan A. Dickey  
> >  wrote:
> >> I'm posting documents to Solr using http (curl) from
> >> C++/C code and am seeing approximately 3.3 - 3.4
> >> documents per second being posted.  Is this to be expected?
> >
> > No, that's very slow.
> > Are you using libcurl, or actually forking a new process for every  
> > document?
> > Are you committing on every document?
> >
> > If you can, using Java would make your life much easier since you
> > > could use the SolrJ client and its binary protocol for indexing.
> >
> > -Yonik
> > http://www.lucidimagination.com
> >
> 
> 

-- 
Dan A. Dickey | Senior Software Engineer

Savvis
10900 Hampshire Ave. S., Bloomington, MN  55438
Office: 952.852.4803 | Fax: 952.852.4951
E-mail: dan.dic...@savvis.net




Re: Default Query Type For Facet Queries

2009-09-10 Thread Stephen Duncan Jr
If using {!type=customparser} is the only way now, should I file an issue to
make the default configurable?
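
(For reference, the workaround looks something like this on each request,
using the local-params syntax:

facet.query={!type=customparser}myfield:some value

which is exactly what I'd rather not have to prepend and strip everywhere.)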

-- 
Stephen Duncan Jr
www.stephenduncanjr.com

On Thu, Sep 3, 2009 at 11:23 AM, Stephen Duncan Jr  wrote:

> We have a custom query parser plugin registered as the default for
> searches, and we'd like to have the same parser used for facet.query.
>
> Is there a way to register it as the default for FacetComponent in
> solrconfig.xml?
>
> I know I can add {!type=customparser} to each query as a workaround, but
> I'd rather register it in the config than make my code send that and strip
> it off on every facet query.
>
> --
> Stephen Duncan Jr
> www.stephenduncanjr.com
>


Re: Solr http post performance seems slow - help?

2009-09-10 Thread Dan A. Dickey
On Thursday 10 September 2009 09:10:27 am Walter Underwood wrote:
> How big are your documents?

For the most part, I'm just indexing metadata that has been pulled from
the documents.  I think I have currently about 40 or so fields that I'm setting.
When the document is an actual document - pdf, doc, etc... I use the DIH
to extract stuff and also set the metadata then.

> Is your index on local disk or network- 
> mounted disk?

I'm basically pulling the metadata info from a database and the documents
themselves are shared via NFS to the Solr indexer.
-Dan

> 
> wunder
> 
> On Sep 10, 2009, at 6:39 AM, Yonik Seeley wrote:
> 
> > On Thu, Sep 10, 2009 at 9:13 AM, Dan A. Dickey  
> >  wrote:
> >> I'm posting documents to Solr using http (curl) from
> >> C++/C code and am seeing approximately 3.3 - 3.4
> >> documents per second being posted.  Is this to be expected?
> >
> > No, that's very slow.
> > Are you using libcurl, or actually forking a new process for every  
> > document?
> > Are you committing on every document?
> >
> > If you can, using Java would make your life much easier since you
> > > could use the SolrJ client and its binary protocol for indexing.
> >
> > -Yonik
> > http://www.lucidimagination.com
> >
> 
> 

-- 
Dan A. Dickey | Senior Software Engineer

Savvis
10900 Hampshire Ave. S., Bloomington, MN  55438
Office: 952.852.4803 | Fax: 952.852.4951
E-mail: dan.dic...@savvis.net


Re: Solr http post performance seems slow - help?

2009-09-10 Thread Dan A. Dickey
On Thursday 10 September 2009 08:39:38 am Yonik Seeley wrote:
> On Thu, Sep 10, 2009 at 9:13 AM, Dan A. Dickey  wrote:
> > I'm posting documents to Solr using http (curl) from
> > C++/C code and am seeing approximately 3.3 - 3.4
> > documents per second being posted.  Is this to be expected?
> 
> No, that's very slow.
> Are you using libcurl, or actually forking a new process for every document?

I'm using libcurl and not forking.

> Are you committing on every document?

No.

> If you can, using Java would make your life much easier since you
> could use the SolrJ client and it's binary protocol for indexing.

As much as I'd like to, I can't.  At this point in time it would take far
too much code restructuring and rewriting.  There is a database involved,
and some senseless portability library being used - though we only run on
Linux at this point in time.  It's just too much work to switch over to using
Java, for now.
-Dan

-- 
Dan A. Dickey | Senior Software Engineer

Savvis
10900 Hampshire Ave. S., Bloomington, MN  55438
Office: 952.852.4803 | Fax: 952.852.4951
E-mail: dan.dic...@savvis.net


Re: query parser question

2009-09-10 Thread Yonik Seeley
On Thu, Sep 10, 2009 at 1:28 PM, Joe Calderon  wrote:
> I have a field called text_stem that has a kstemmer on it, and I'm having
> trouble matching wildcard searches on a word that got stemmed.
>
> For example, I index the word "america's", which according to
> analysis.jsp gets indexed after stemming as "america".
>
> When matching, I do a query like myfield:(ame*), which matches the
> indexed term. This all works fine until the query becomes
> myfield:(america's*), at which point it doesn't match. However, if I
> remove the wildcard, like myfield:(america's), then it works again.
>
> It's almost like the term doesn't get stemmed when using a wildcard.

Correct - it's not stemmed.  If it were stemmed, there would be
multiple cases where that wouldn't work either.

For example, with the porter stemmer, "any"->"ani" and "anywhere"->"anywher"

So if you had a document with "anywhere", a prefix query of "any*"
wouldn't work if you stemmed it, and would match other things like
"animal".

-Yonik
http://www.lucidimagination.com


RE: OutOfMemory error on solr 1.3

2009-09-10 Thread Francis Yakin
So, do you think increasing the JVM heap will help? We also have
500 in solrconfig.xml; originally it was set to 200.

Currently we give Solr 1.5GB for Xms and Xmx; we use JRockit version 1.5.0_15.

4 S root 12543 12495 16  76   0 - 848974 184466 Jul20 ?   8-11:12:03 
/opt/bea/jrmc-3.0.3-1.5.0/bin/java -Xms1536m -Xmx1536m -Xns:128m -Xgc:gencon 
-Djavelin.jsp.el.elcache=4096 
-Dsolr.solr.home=/opt/apache-solr-1.3.0/example/solr
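
(If we do increase it, I assume that would just mean something like
-Xms2048m -Xmx2048m in place of the 1536m above.)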

Francis

-Original Message-
From: Constantijn Visinescu [mailto:baeli...@gmail.com]
Sent: Wednesday, September 09, 2009 11:35 PM
To: solr-user@lucene.apache.org
Subject: Re: OutOfMemory error on solr 1.3

Just wondering, how much memory are you giving your JVM ?

On Thu, Sep 10, 2009 at 7:46 AM, Francis Yakin  wrote:

>
> I am having OutOfMemory error on our slaves server, I would like to know if
> someone has the same issue and have the solution for this.
>
> SEVERE: Error during auto-warming of
> key:org.apache.solr.search.queryresult...@96cd2ffc:java.lang.OutOfMemoryError:
> allocLargeObjectOrArray - Object size: 5395576, Num elements: 1348890
> SEVERE: java.lang.OutOfMemoryError: allocLargeObjectOrArray - Object size:
> 441216, Num elements: 55150
> SEVERE: Error during auto-warming of
> key:org.apache.solr.search.queryresult...@519116e0:java.lang.OutOfMemoryError:
> allocLargeObjectOrArray - Object size: 5395576, Num elements: 1348890
> SEVERE: Error during auto-warming of
> key:org.apache.solr.search.queryresult...@74dc52fa:java.lang.OutOfMemoryError:
> allocLargeObjectOrArray - Object size: 5395576, Num elements: 1348890
> SEVERE: Error during auto-warming of
> key:org.apache.solr.search.queryresult...@d0dd3e28:java.lang.OutOfMemoryError:
> allocLargeObjectOrArray - Object size: 5395576, Num elements: 1348890
> SEVERE: Error during auto-warming of
> key:org.apache.solr.search.queryresult...@b6dfa5bc:java.lang.OutOfMemoryError:
> allocLargeObjectOrArray - Object size: 14128832, Num elements: 3532204
> SEVERE: Error during auto-warming of
> key:org.apache.solr.search.queryresult...@482b13ef:java.lang.OutOfMemoryError:
> allocLargeObjectOrArray - Object size: 14128832, Num elements: 3532204
> SEVERE: Error during auto-warming of
> key:org.apache.solr.search.queryresult...@2309438c:java.lang.OutOfMemoryError:
> allocLargeObjectOrArray - Object size: 14128832, Num elements: 3532204
> SEVERE: Error during auto-warming of
> key:org.apache.solr.search.queryresult...@277bd48c:java.lang.OutOfMemoryError:
> allocLargeObjectOrArray - Object size: 14128832, Num elements: 3532204
> Exception in thread "[ACTIVE] ExecuteThread: '7' for queue:
> 'weblogic.kernel.Default (self-tuning)'" java.lang.OutOfMemoryError:
> allocLargeObjectOrArray - Object size: 8208, Num elements: 8192
> Exception in thread "[ACTIVE] ExecuteThread: '8' for queue:
> 'weblogic.kernel.Default (self-tuning)'" java.lang.OutOfMemoryError:
> allocLargeObjectOrArray - Object size: 8208, Num elements: 8192
> Exception in thread "[ACTIVE] ExecuteThread: '10' for queue:
> 'weblogic.kernel.Default (self-tuning)'" java.lang.OutOfMemoryError:
> allocLargeObjectOrArray - Object size: 8208, Num elements: 8192
> Exception in thread "[ACTIVE] ExecuteThread: '11' for queue:
> 'weblogic.kernel.Default (self-tuning)'" java.lang.OutOfMemoryError:
> allocLargeObjectOrArray - Object size: 8208, Num elements: 8192
> SEVERE: Error during auto-warming of
> key:org.apache.solr.search.queryresult...@41405463:java.lang.OutOfMemoryError:
> allocLargeObjectOrArray - Object size: 751552, Num elements: 187884
>  java.lang.OutOfMemoryError: allocLargeObjectOrArray - Object size: 8208,
> Num elements: 8192
> java.lang.OutOfMemoryError: allocLargeObjectOrArray - Object size: 8208,
> Num elements: 8192
> java.lang.OutOfMemoryError: allocLargeObjectOrArray - Object size: 5096,
> Num elements: 2539
> java.lang.OutOfMemoryError: allocLargeObjectOrArray - Object size: 5400,
> Num elements: 2690
>
>  deployment service message for request id "-1" from server "AdminServer".
> Exception is: "java.lang.OutOfMemoryError: allocLargeObjectOrArray - Object
> size: 4368, Num elements: 2174
> SEVERE: java.lang.OutOfMemoryError: allocLargeObjectOrArray - Object size:
> 14140768, Num elements: 3535188
> SEVERE: Error during auto-warming of
> key:org.apache.solr.search.queryresult...@8dbcc7ab:java.lang.OutOfMemoryError:
> allocLargeObjectOrArray - Object size: 5395576, Num elements: 1348890
> java.lang.OutOfMemoryError: allocLargeObjectOrArray - Object size: 5320,
> Num elements: 2649
> SEVERE: Error during auto-warming of
> key:org.apache.solr.search.queryresult...@4d0c6fc5:java.lang.OutOfMemoryError:
> allocLargeObjectOrArray - Object size: 751560, Num elements: 187885
> java.lang.OutOfMemoryError: allocLargeObjectOrArray - Object size: 16400,
> Num elements: 8192
> SEVERE: Error during auto-warming of
> key:org.apache.solr.search.queryresult...@fb6bac19:java.lang.OutOfMemoryError:
> allocLargeObjectOrArray - Object size: 14140904, Num elements: 3535222
> SEV

query parser question

2009-09-10 Thread Joe Calderon
I have a field called text_stem that has a kstemmer on it, and I'm having
trouble matching wildcard searches on a word that got stemmed.

For example, I index the word "america's", which according to
analysis.jsp gets indexed after stemming as "america".

When matching, I do a query like myfield:(ame*), which matches the
indexed term. This all works fine until the query becomes
myfield:(america's*), at which point it doesn't match. However, if I
remove the wildcard, like myfield:(america's), then it works again.

It's almost like the term doesn't get stemmed when using a wildcard.

I'm using a 1.4 nightly. Is this the correct behaviour? Is there
something I should do differently?

In the meantime I've added "americas" as a protected word in the
kstemmer, but I'm afraid of more edge cases that will come up.

--joe


Re: TermsComponent

2009-09-10 Thread Todd Benge
Thanks for the pointer.  Definitely appreciate the help.

Todd

On Thu, Sep 10, 2009 at 11:10 AM, Jay Hill  wrote:

> If you need an alternative to using the TermsComponent for auto-suggest,
> have a look at this blog on using EdgeNGrams instead of the TermsComponent.
>
>
> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
>
> -Jay
> http://www.lucidimagination.com
>
>
> On Wed, Sep 9, 2009 at 3:35 PM, Todd Benge  wrote:
>
> > We're using the StandardAnalyzer but I'm fairly certain that's not the
> > issue.
> >
> > In fact, there doesn't appear to be any issue with Lucene or Solr.
>  There
> > are many instances of data in which users have removed the whitespace so
> > they have a high frequency which means they bubble to the top of the
> sort.
> > The result is that a search for a name shows a first and last name
> without
> > the whitespace.
> >
> > One thing I've noticed is that since TermsComponent is working on a
> single
> > Term, there doesn't seem to be a way to query against a phrase.  The same
> > example as above applies, so if you're querying for name it'd be preferred
> > to
> > get multi-term responses back if a first name matches.
> >
> > Any suggestions?
> >
> > Thanks for all the help.  It's much appreciated.
> >
> > Todd
> >
> >
> > On Wed, Sep 9, 2009 at 12:11 PM, Grant Ingersoll  > >wrote:
> >
> > > And what Analyzer are you using?  I'm guessing that your words are
> being
> > > split up during analysis, which is why you aren't seeing whitespace.
>  If
> > you
> > > want to keep the whitespace, you will need to use the String field type
> > or
> > > possibly the Keyword Analyzer.
> > >
> > > -Grant
> > >
> > >
> > > On Sep 9, 2009, at 11:06 AM, Todd Benge wrote:
> > >
> > >  It's set as Field.Store.YES, Field.Index.ANALYZED.
> > >>
> > >>
> > >>
> > >> On Wed, Sep 9, 2009 at 8:15 AM, Grant Ingersoll 
> > >> wrote:
> > >>
> > >>  How are you tokenizing/analyzing the field you are accessing?
> > >>>
> > >>>
> > >>> On Sep 9, 2009, at 8:49 AM, Todd Benge wrote:
> > >>>
> > >>> Hi Rekha,
> > >>>
> > 
> >  Here's the link to the TermsComponent info:
> > 
> >  http://wiki.apache.org/solr/TermsComponent
> > 
> >  and another link Matt Weber did on autocompletion:
> > 
> > 
> > 
> > 
> >
> http://www.mattweber.org/2009/05/02/solr-autosuggest-with-termscomponent-and-jquery/
> > 
> >  We had to upgrade to the latest nightly to get the TermsComponent to
> >  work.
> > 
> >  Good Luck!
> > 
> >  Todd
> > 
> >  On Wed, Sep 9, 2009 at 5:17 AM, dharhsana <
> rekha.dharsh...@gmail.com>
> >  wrote:
> > 
> > 
> >   Hi,
> > >
> > I have a requirement on autocompletion search; I am using Solr 1.4.
> >
> > Could you please tell me how you got the Terms component working with
> > Solr 1.4? I couldn't find the Terms component in the Solr 1.4 that I
> > downloaded; is there any other configuration that should be done?
> >
> > Do you have code for autocompletion? Please share it with me.
> > >
> > > Regards
> > > Rekha
> > >
> > >
> > >
> > > tbenge wrote:
> > >
> > >
> > >> Hi,
> > >>
> > >> I was looking at TermsComponent in Solr 1.4 as a way of building an
> > >> autocomplete function.  I have a prototype working but noticed that
> > >> terms that have whitespace in them when indexed are missing the
> > >> whitespace when returned from the TermsComponent.
> > >>
> > >> Any ideas on why that may be happening?  Am I just missing a
> > >>
> > >>  configuration
> > >
> > >  option?
> > >>
> > >> Thanks,
> > >>
> > >> Todd
> > >>
> > >>
> > >>
> > >>  --
> > > View this message in context:
> > > http://www.nabble.com/TermsComponent-tp25302503p25362829.html
> > > Sent from the Solr - User mailing list archive at Nabble.com.
> > >
> > >
> > >
> > >  --
> > >>> Grant Ingersoll
> > >>> http://www.lucidimagination.com/
> > >>>
> > >>> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
> > using
> > >>> Solr/Lucene:
> > >>> http://www.lucidimagination.com/search
> > >>>
> > >>>
> > >>>
> > > --
> > > Grant Ingersoll
> > > http://www.lucidimagination.com/
> > >
> > > Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
> using
> > > Solr/Lucene:
> > > http://www.lucidimagination.com/search
> > >
> > >
> >
>


Re: TermsComponent

2009-09-10 Thread Jay Hill
If you need an alternative to using the TermsComponent for auto-suggest,
have a look at this blog on using EdgeNGrams instead of the TermsComponent.

http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/

-Jay
http://www.lucidimagination.com


On Wed, Sep 9, 2009 at 3:35 PM, Todd Benge  wrote:

> We're using the StandardAnalyzer but I'm fairly certain that's not the
> issue.
>
> In fact, there doesn't appear to be any issue with Lucene or Solr.  There
> are many instances of data in which users have removed the whitespace so
> they have a high frequency which means they bubble to the top of the sort.
> The result is that a search for a name shows a first and last name without
> the whitespace.
>
> One thing I've noticed is that since TermsComponent is working on a single
> Term, there doesn't seem to be a way to query against a phrase.  The same
> example as above applies, so if you're querying for name it'd be preferred
> to
> get multi-term responses back if a first name matches.
>
> Any suggestions?
>
> Thanks for all the help.  It's much appreciated.
>
> Todd
>
>
> On Wed, Sep 9, 2009 at 12:11 PM, Grant Ingersoll  >wrote:
>
> > And what Analyzer are you using?  I'm guessing that your words are being
> > split up during analysis, which is why you aren't seeing whitespace.  If
> you
> > want to keep the whitespace, you will need to use the String field type
> or
> > possibly the Keyword Analyzer.
> >
> > -Grant
> >
> >
> > On Sep 9, 2009, at 11:06 AM, Todd Benge wrote:
> >
> >  It's set as Field.Store.YES, Field.Index.ANALYZED.
> >>
> >>
> >>
> >> On Wed, Sep 9, 2009 at 8:15 AM, Grant Ingersoll 
> >> wrote:
> >>
> >>  How are you tokenizing/analyzing the field you are accessing?
> >>>
> >>>
> >>> On Sep 9, 2009, at 8:49 AM, Todd Benge wrote:
> >>>
> >>> Hi Rekha,
> >>>
> 
>  Here's the link to the TermsComponent info:
> 
>  http://wiki.apache.org/solr/TermsComponent
> 
>  and another link Matt Weber did on autocompletion:
> 
> 
> 
> 
> http://www.mattweber.org/2009/05/02/solr-autosuggest-with-termscomponent-and-jquery/
> 
>  We had to upgrade to the latest nightly to get the TermsComponent to
>  work.
> 
>  Good Luck!
> 
>  Todd
> 
>  On Wed, Sep 9, 2009 at 5:17 AM, dharhsana 
>  wrote:
> 
> 
>   Hi,
> >
> > I have a requirement on autocompletion search; I am using Solr 1.4.
> >
> > Could you please tell me how you got the Terms component working with
> > Solr 1.4? I couldn't find the Terms component in the Solr 1.4 that I
> > downloaded; is there any other configuration that should be done?
> >
> > Do you have code for autocompletion? Please share it with me.
> >
> > Regards
> > Rekha
> >
> >
> >
> > tbenge wrote:
> >
> >
> >> Hi,
> >>
> >> I was looking at TermsComponent in Solr 1.4 as a way of building an
> >> autocomplete function.  I have a prototype working but noticed that
> >> terms that have whitespace in them when indexed are missing the
> >> whitespace when returned from the TermsComponent.
> >>
> >> Any ideas on why that may be happening?  Am I just missing a
> >>
> >>  configuration
> >
> >  option?
> >>
> >> Thanks,
> >>
> >> Todd
> >>
> >>
> >>
> >>  --
> > View this message in context:
> > http://www.nabble.com/TermsComponent-tp25302503p25362829.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
> >
> >
> >  --
> >>> Grant Ingersoll
> >>> http://www.lucidimagination.com/
> >>>
> >>> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
> using
> >>> Solr/Lucene:
> >>> http://www.lucidimagination.com/search
> >>>
> >>>
> >>>
> > --
> > Grant Ingersoll
> > http://www.lucidimagination.com/
> >
> > Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
> > Solr/Lucene:
> > http://www.lucidimagination.com/search
> >
> >
>


Re: Pagination with solr json data

2009-09-10 Thread Jay Hill
All you have to do is use the "start" and "rows" parameters to get the
results you want. For example, the query for the first page of results might
look like this:
?q=solr&start=0&rows=10 (other params omitted). So you'll start at the
beginning (0) and get 10 results. The next page would be
?q=solr&start=10&rows=10 - start at the 10th result and display the next 10
rows. Then ?q=solr&start=20&rows=10, and so on.
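In general, for a zero-based page number p and a fixed page size, the pattern
is start = p * rows.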

-Jay
http://www.lucidimagination.com


On Wed, Sep 9, 2009 at 12:24 PM, Elaine Li  wrote:

> Hi,
>
> What is the best way to do pagination?
>
> I searched around and only found some YUI utilities that can do this. But
> their examples don't closely match the pattern I have in
> mind. I would like to have a pretty plain display, something like the
> search results from Google.
>
> Thanks.
>
> Elaine
>


Re: Passing FuntionQuery string parameters

2009-09-10 Thread wojtekpia

It looks like parseArg was added on Aug 20, 2009. I'm working with slightly
older code. Thanks!
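
For anyone who finds this thread later, with a build that has parseArg the
parser side ends up looking roughly like this (LevenshteinValueSource is a
made-up name standing in for whatever ValueSource computes the distance; it
is not a class that ships with Solr):

import org.apache.lucene.queryParser.ParseException;
import org.apache.solr.search.FunctionQParser;
import org.apache.solr.search.ValueSourceParser;
import org.apache.solr.search.function.ValueSource;

public class LevenshteinValueSourceParser extends ValueSourceParser {
  @Override
  public ValueSource parse(FunctionQParser fp) throws ParseException {
    ValueSource field = fp.parseValueSource(); // myFieldName
    String target = fp.parseArg();             // 'my string to match'
    return new LevenshteinValueSource(field, target); // hypothetical ValueSource
  }
}

Registered in solrconfig.xml via a <valueSourceParser name="lev" .../> element,
lev(myFieldName, 'my string to match') then parses as intended.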


Noble Paul നോബിള്‍  नोब्ळ्-2 wrote:
> 
> did you implement your own ValueSourceParser? The
> FunctionQParser#parseArg() method supports strings
> 
> On Wed, Sep 9, 2009 at 12:10 AM, wojtekpia wrote:
>>
>> Hi,
>>
>> I'm writing a function query to score documents based on Levenshtein
>> distance from a string. I want my function calls to look like:
>>
>> lev(myFieldName, 'my string to match')
>>
>> I'm running into trouble parsing the string I want to match ('my string
>> to
>> match' above). It looks like all the built-in support is for parsing
>> field
>> names and numeric values. Am I missing the string parsing support, or is
>> it
>> not there, and if not, why?
>>
>> Thanks,
>>
>> Wojtek
>> --
>> View this message in context:
>> http://www.nabble.com/Passing-FuntionQuery-string-parameters-tp25351825p25351825.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> 
> -- 
> -
> Noble Paul | Principal Engineer| AOL | http://aol.com
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Passing-FuntionQuery-string-parameters-tp25351825p25386910.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Backups using Replication

2009-09-10 Thread wojtekpia

I'm using trunk from July 8, 2009. Do you know if it's more recent than that?


Noble Paul നോബിള്‍  नोब्ळ्-2 wrote:
> 
> which version of Solr are you using? the "backupAfter" name was
> introduced recently
> 

-- 
View this message in context: 
http://www.nabble.com/Backups-using-Replication-tp25350083p25386886.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Re : Re : Re : Pb using delta import with XPathEntityProcessor

2009-09-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
https://issues.apache.org/jira/browse/SOLR-1421

2009/9/10 Noble Paul നോബിള്‍  नोब्ळ् :
> I guess there is a bug. I shall raise an issue.
>
>
>
> 2009/9/10 Noble Paul നോബിള്‍  नोब्ळ् :
>> everything looks fine and it beats me completely. I guess you will
>> have to debug this
>>
>> On Thu, Sep 10, 2009 at 6:17 PM, nourredine khadri
>>  wrote:
>>> Some fields are null but not the one parsed by XPathEntityProcessor (named 
>>> XML)
>>>
>>> 10 sept. 2009 14:40:34 org.apache.solr.handler.dataimport.LogTransformer 
>>> transformRow
>>> FIN: Map content : {KEYWORDS=pub, SPECIFIC=null, FATHERSID=, CONTAINERID=, 
>>> ARCHIVEDDATE=0, SITE=12308, LANGUAGE=null, ARCHIVESTATE=false, 
>>> OFFLINEATDATE=0, ONLINEATDATE=1026307864230, STATUS=0, 
>>> DATESTATUS=1113905585726, MODEL=0, ACTIVATIONSTATE=true, 
>>> MOUNTED_SITE_IDS=null, SPECIFIC_XML=null, PUBLICATIONSTATE=true, XML=>> version="1.0" encoding="ISO-8859-1"?>   >> Template="Article" Ref="10">   Empty Subtitle - Click Here 
>>> to edit   Empty Title - Click Here to 
>>> edit   Empty Chap¶ - Click Here to 
>>> edit   Empty Autor - Click Here to 
>>> edit   Empty Catchword - Click Here to 
>>> edit   Empty InterTitle - Cl
>>> ick Here to edit TextEmpty Paragraph - Click Here 
>>> to edit Text        
>>> , IDENTIFIERVERSION=5040052, CONTENTID=5040052}
>>> 10 sept. 2009 14:40:34 org.apache.solr.handler.dataimport.DocBuilder 
>>> buildDocument
>>> GRAVE: Exception while processing: xml_document document : 
>>> SolrInputDocument[{keywords=keywords(1.0)={pub}, 
>>> fathersId=fathersId(1.0)={}, containerId=containerId(1.0)={}, 
>>> site=site(1.0)={12308}, archiveState=archiveState(1.0)={false}, 
>>> offlineAtDate=offlineAtDate(1.0)={0}, 
>>> onlineAtDate=onlineAtDate(1.0)={1026307864230}, status=status(1.0)={0}, 
>>> dateStatus=dateStatus(1.0)={1113905585726}, model=model(1.0)={0}, 
>>> activationState=activationState(1.0)={true}, 
>>> publicationState=publicationState(1.0)={true}, xml=xml(1.0)={>> version="1.0" encoding="ISO-8859-1"?>   >> Template="Article" Ref="10">   Empty Subtitle - Click Here 
>>> to edit   Empty Title - Click Here to 
>>> edit   Empty Chap¶ - Click Here to edit<
>>> /Parag>   Empty Autor - Click Here to edit   
>>> Empty Catchword - Click Here to edit   
>>> Empty InterTitle - Click Here to edit 
>>> TextEmpty Paragraph - Click Here to edit 
>>> Text        
>>> }, identifierversion=identifierversion(1.0)={5040052}, 
>>> contentid=contentid(1.0)={5040052}}]
>>> org.apache.solr.handler.dataimport.DataImportHandlerException: Parsing 
>>> failed for xml, url:null rows processed:0 Processing Document # 1
>>>        at 
>>> org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
>>>        at 
>>> org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:292)
>>>        at 
>>> org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:187)
>>>        at 
>>> org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:164)
>>>        at 
>>> org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:237)
>>>        at 
>>> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
>>>        at 
>>> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:365)
>>>        at 
>>> org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:259)
>>>        at 
>>> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:159)
>>>        at 
>>> org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:354)
>>>        at 
>>> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:395)
>>>        at 
>>> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:372)
>>> Caused by: java.lang.RuntimeException: java.lang.NullPointerException
>>>        at 
>>> org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:92)
>>>        at 
>>> org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:282)
>>>        ... 10 more
>>> Caused by: java.lang.NullPointerException
>>>        at 
>>> com.ctc.wstx.io.ReaderBootstrapper.initialLoad(ReaderBootstrapper.java:245)
>>>        at 
>>> com.ctc.wstx.io.ReaderBootstrapper.bootstrapInput(ReaderBootstrapper.java:132)
>>>        at 
>>> com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:543)
>>>        at 
>>> com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:604)
>>>        at 
>>> com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:660)
>>>        at 
>>> com.ctc.wstx.stax.WstxInputFactory.createXMLStreamReader(WstxInputFactory.java:331)
>>>        at 
>>> org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:88)
>>>        ... 11 more
>>> 10 sept. 2009 14:40:34 org.apa

Re: Re : Re : Re : Pb using delta import with XPathEntityProcessor

2009-09-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
I guess there is a bug. I shall raise an issue.



2009/9/10 Noble Paul നോബിള്‍  नोब्ळ् :
> everything looks fine and it beats me completely. I guess you will
> have to debug this
>
> On Thu, Sep 10, 2009 at 6:17 PM, nourredine khadri
>  wrote:
>> Some fields are null but not the one parsed by XPathEntityProcessor (named 
>> XML)
>>
>> 10 sept. 2009 14:40:34 org.apache.solr.handler.dataimport.LogTransformer 
>> transformRow
>> FIN: Map content : {KEYWORDS=pub, SPECIFIC=null, FATHERSID=, CONTAINERID=, 
>> ARCHIVEDDATE=0, SITE=12308, LANGUAGE=null, ARCHIVESTATE=false, 
>> OFFLINEATDATE=0, ONLINEATDATE=1026307864230, STATUS=0, 
>> DATESTATUS=1113905585726, MODEL=0, ACTIVATIONSTATE=true, 
>> MOUNTED_SITE_IDS=null, SPECIFIC_XML=null, PUBLICATIONSTATE=true, XML=> version="1.0" encoding="ISO-8859-1"?>   > Template="Article" Ref="10">   Empty Subtitle - Click Here 
>> to edit   Empty Title - Click Here to 
>> edit   Empty Chap¶ - Click Here to 
>> edit   Empty Autor - Click Here to edit 
>>   Empty Catchword - Click Here to edit   
>> Empty InterTitle - Cl
>> ick Here to edit TextEmpty Paragraph - Click Here 
>> to edit Text        
>> , IDENTIFIERVERSION=5040052, CONTENTID=5040052}
>> 10 sept. 2009 14:40:34 org.apache.solr.handler.dataimport.DocBuilder 
>> buildDocument
>> GRAVE: Exception while processing: xml_document document : 
>> SolrInputDocument[{keywords=keywords(1.0)={pub}, 
>> fathersId=fathersId(1.0)={}, containerId=containerId(1.0)={}, 
>> site=site(1.0)={12308}, archiveState=archiveState(1.0)={false}, 
>> offlineAtDate=offlineAtDate(1.0)={0}, 
>> onlineAtDate=onlineAtDate(1.0)={1026307864230}, status=status(1.0)={0}, 
>> dateStatus=dateStatus(1.0)={1113905585726}, model=model(1.0)={0}, 
>> activationState=activationState(1.0)={true}, 
>> publicationState=publicationState(1.0)={true}, xml=xml(1.0)={> version="1.0" encoding="ISO-8859-1"?>   > Template="Article" Ref="10">   Empty Subtitle - Click Here 
>> to edit   Empty Title - Click Here to 
>> edit   Empty Chap¶ - Click Here to edit<
>> /Parag>   Empty Autor - Click Here to edit   
>> Empty Catchword - Click Here to edit   
>> Empty InterTitle - Click Here to edit 
>> TextEmpty Paragraph - Click Here to edit 
>> Text        
>> }, identifierversion=identifierversion(1.0)={5040052}, 
>> contentid=contentid(1.0)={5040052}}]
>> org.apache.solr.handler.dataimport.DataImportHandlerException: Parsing 
>> failed for xml, url:null rows processed:0 Processing Document # 1
>>        at 
>> org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
>>        at 
>> org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:292)
>>        at 
>> org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:187)
>>        at 
>> org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:164)
>>        at 
>> org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:237)
>>        at 
>> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
>>        at 
>> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:365)
>>        at 
>> org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:259)
>>        at 
>> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:159)
>>        at 
>> org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:354)
>>        at 
>> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:395)
>>        at 
>> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:372)
>> Caused by: java.lang.RuntimeException: java.lang.NullPointerException
>>        at 
>> org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:92)
>>        at 
>> org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:282)
>>        ... 10 more
>> Caused by: java.lang.NullPointerException
>>        at 
>> com.ctc.wstx.io.ReaderBootstrapper.initialLoad(ReaderBootstrapper.java:245)
>>        at 
>> com.ctc.wstx.io.ReaderBootstrapper.bootstrapInput(ReaderBootstrapper.java:132)
>>        at 
>> com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:543)
>>        at 
>> com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:604)
>>        at 
>> com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:660)
>>        at 
>> com.ctc.wstx.stax.WstxInputFactory.createXMLStreamReader(WstxInputFactory.java:331)
>>        at 
>> org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:88)
>>        ... 11 more
>> 10 sept. 2009 14:40:34 org.apache.solr.handler.dataimport.DataImporter 
>> doDeltaImport
>> GRAVE: Delta Import Failed
>> org.apache.solr.handler.dataimport.DataImportHandlerException: Parsing 
>> failed for xml, url:null

Re: Field Collapsing (was Re: Schema for group/child entity setup)

2009-09-10 Thread Uri Boness

The current patch definitely supports faceting before and after collapsing.

Stephen Weiss wrote:
I just noticed this and it reminded me of an issue I've had with 
collapsed faceting with an older version of the patch in Solr 1.3.  
Would it be possible, if we can get the terms for all the collapsed 
documents on a field, to then facet each collapsed document on the 
unique terms it has collectively?  What I mean is for example:


Doc 1, 2, 3 collapse together on some other field

Doc 1 is the "main document" and has the "colors" blue and red
Doc 2 has red
Doc 3 has green

For the purposes of faceting, it would be ideal in our case for 
faceting on color to count one each for blue, red, and green on this 
document (the user drills down on this value to yet another collapsed 
set).  Right now, when you facet after collapse you just get blue and 
red (green is dropped because it collapses out).  To the user it makes 
the counts seem inaccurate, like they're missing something.  Instead 
we facet before collapsing and get an "inflated" value (which ticks 2 
for red - but when you drill down, you still only get 1 because Doc 1 
and Doc 2 collapse together again).  Either way it's not ideal.


At the time (many months ago) there was no way to account for this but 
it sounds like this patch could make it possible, maybe.


Thanks!

--
Steve

On Sep 5, 2009, at 5:57 AM, Uri Boness wrote:

There's work on the patch that is being done now which will enable 
you to ask for specific field values of the collapsed documents using 
a dedicated request parameter. This work is not committed yet to the 
latest patch, but will be very soon. There is of course a drawback to 
that as well, the collapsed documents set can be very large (depends 
on your data of course) in which case the returned result which 
includes the fields values can be rather large, which will impact 
performance, this is why this feature will be enabled only if you 
specify this extra parameter - by default no field values will be 
returned.


AFAIK, the latest patch should work fine with the latest build. 
Martijn (which is the main maintainer of this patch) tries to keep it 
up to date with the latest builds. But I guess the safest way is to 
work with the nightly build of the same date as the latest patch 
(though I would give it a try first with the latest build).


BTW, it's not an official suggestion from the Solr development team, 
but if you ask me, if you have to choose now whether to use 1.3 or 
1.4-dev, I would go for the later. 1.4 is supposed to be released in 
the upcoming week or two and it bring loads of bug fixes, 
enhancements and extra functionality. But again, this is my personal 
suggestion.


cheers,
Uri





Facet fields and the DisMax query handler

2009-09-10 Thread Villemos, Gert
I'm trying to understand the DisMax query handler. I originally
configured it to ensure that the query was mapped onto different fields
in the documents and a boost assigned if the fields match. And that
works pretty smoothly.
 
However, when it comes to faceted searches the results perplex me.
Consider the following example:
 
Document A:
<Staff>John Doe</Staff>
 
Document B:
<ProjectManager>John Doe</ProjectManager>
 
The following queries do not return anything:
Staff:Doe
Staff:Doe*
Staff:John
Staff:John*
 
The query:
Staff:"John"
 
Returns Documents A and B, even though document B doesn't even contain the
field 'Staff' (which is optional)! Through the "qf" parameter dismax has
been configured to search over the field 'ProjectManager', but I expected
the usage of a field-qualified value would exclude that field... Looking at
the score of the documents, document A does score much higher than document
B (by a factor of 20), but I would expect not to see B at all. I have changed
the dismax configuration minimum match to 1, to ensure that all hits
with a single match are returned, without effect. I have changed the tie
to 0 with no effect.
 
What am I missing here? I would like queries such as 'Staff:Doe' to
return document A, and only A.
 
Cheers,
Gert.
 





Re: solr 1.3 and multicore data directory

2009-09-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
you do not have to make 3 copies of the conf dir, even in Solr 1.3

you can try this

<dataDir>${solr.data.dir:./solr/${solr.core.name}/data}</dataDir>



On Thu, Sep 10, 2009 at 7:55 PM, Paul Rosen  wrote:
> Ok. I have a workaround for now. I've duplicated the conf folder three times
> and changed this line in solrconfig.xml in each folder:
>
>  <dataDir>${solr.data.dir:./solr/exhibits/data}</dataDir>
>
> I can't wait for solr 1.4!
>
> Noble Paul നോബിള്‍ नोब्ळ् wrote:
>>
>> the dataDir is a Solr1.4 feature
>>
>> On Thu, Sep 10, 2009 at 1:57 AM, Paul Rosen 
>> wrote:
>>>
>>> Hi All,
>>>
>>> I'm trying to set up solr 1.3 to use multicore but I'm getting some
>>> puzzling
>>> results. My solr.xml file is:
>>>
>>> <solr persistent="false">
>>>  <cores adminPath="/admin/cores">
>>>   <core name="resources" dataDir="solr/resources/data/" />
>>>   <core name="exhibits" dataDir="solr/exhibits/data/" />
>>>   <core name="reindex_resources" dataDir="solr/reindex_resources/data/" />
>>>  </cores>
>>> </solr>
>>>
>>> When I start up solr, everything looks normal until I get this line in
>>> the
>>> log:
>>>
>>> INFO: [resources] Opening new SolrCore at solr/resources/,
>>> dataDir=./solr/data/
>>>
>>> And a new folder is created ./solr/data/index with a blank index. And, of
>>> course, any queries go to that blank index and not to one of my cores.
>>>
>>> Actually, what I'd really like is to have my directory structure look
>>> like
>>> this (some items removed for brevity):
>>>
>>> -
>>> solr_1.3
>>>   lib
>>>   solr
>>>       solr.xml
>>>       bin
>>>       conf
>>>       data
>>>           resources
>>>               index
>>>           exhibits
>>>               index
>>>           reindex_resources
>>>               index
>>> start.jar
>>> -
>>>
>>> And have all the cores share everything except an index.
>>>
>>> How would I set that up?
>>>
>>> Are there differences between 1.3 and 1.4 in this respect?
>>>
>>> Thanks,
>>> Paul
>>>
>>
>>
>>
>
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Re: Re : Re : Re : Pb using delta import with XPathEntityProcessor

2009-09-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
everything looks fine and it beats me completely. I guess you will
have to debug this

On Thu, Sep 10, 2009 at 6:17 PM, nourredine khadri
 wrote:
> Some fields are null but not the one parsed by XPathEntityProcessor (named 
> XML)
>
> 10 sept. 2009 14:40:34 org.apache.solr.handler.dataimport.LogTransformer 
> transformRow
> FIN: Map content : {KEYWORDS=pub, SPECIFIC=null, FATHERSID=, CONTAINERID=, 
> ARCHIVEDDATE=0, SITE=12308, LANGUAGE=null, ARCHIVESTATE=false, 
> OFFLINEATDATE=0, ONLINEATDATE=1026307864230, STATUS=0, 
> DATESTATUS=1113905585726, MODEL=0, ACTIVATIONSTATE=true, 
> MOUNTED_SITE_IDS=null, SPECIFIC_XML=null, PUBLICATIONSTATE=true, XML= version="1.0" encoding="ISO-8859-1"?>    Ref="10">   Empty Subtitle - Click Here to 
> edit   Empty Title - Click Here to 
> edit   Empty Chap¶ - Click Here to 
> edit   Empty Autor - Click Here to edit  
>  Empty Catchword - Click Here to edit   
> Empty InterTitle - Cl
> ick Here to edit TextEmpty Paragraph - Click Here 
> to edit Text        
> , IDENTIFIERVERSION=5040052, CONTENTID=5040052}
> 10 sept. 2009 14:40:34 org.apache.solr.handler.dataimport.DocBuilder 
> buildDocument
> GRAVE: Exception while processing: xml_document document : 
> SolrInputDocument[{keywords=keywords(1.0)={pub}, fathersId=fathersId(1.0)={}, 
> containerId=containerId(1.0)={}, site=site(1.0)={12308}, 
> archiveState=archiveState(1.0)={false}, offlineAtDate=offlineAtDate(1.0)={0}, 
> onlineAtDate=onlineAtDate(1.0)={1026307864230}, status=status(1.0)={0}, 
> dateStatus=dateStatus(1.0)={1113905585726}, model=model(1.0)={0}, 
> activationState=activationState(1.0)={true}, 
> publicationState=publicationState(1.0)={true}, xml=xml(1.0)={ version="1.0" encoding="ISO-8859-1"?>    Ref="10">   Empty Subtitle - Click Here to 
> edit   Empty Title - Click Here to 
> edit   Empty Chap¶ - Click Here to edit<
> /Parag>   Empty Autor - Click Here to edit   
> Empty Catchword - Click Here to edit   
> Empty InterTitle - Click Here to edit 
> TextEmpty Paragraph - Click Here to edit 
> Text        
> }, identifierversion=identifierversion(1.0)={5040052}, 
> contentid=contentid(1.0)={5040052}}]
> org.apache.solr.handler.dataimport.DataImportHandlerException: Parsing failed 
> for xml, url:null rows processed:0 Processing Document # 1
>        at 
> org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
>        at 
> org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:292)
>        at 
> org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:187)
>        at 
> org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:164)
>        at 
> org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:237)
>        at 
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
>        at 
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:365)
>        at 
> org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:259)
>        at 
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:159)
>        at 
> org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:354)
>        at 
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:395)
>        at 
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:372)
> Caused by: java.lang.RuntimeException: java.lang.NullPointerException
>        at 
> org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:92)
>        at 
> org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:282)
>        ... 10 more
> Caused by: java.lang.NullPointerException
>        at 
> com.ctc.wstx.io.ReaderBootstrapper.initialLoad(ReaderBootstrapper.java:245)
>        at 
> com.ctc.wstx.io.ReaderBootstrapper.bootstrapInput(ReaderBootstrapper.java:132)
>        at 
> com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:543)
>        at 
> com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:604)
>        at 
> com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:660)
>        at 
> com.ctc.wstx.stax.WstxInputFactory.createXMLStreamReader(WstxInputFactory.java:331)
>        at 
> org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:88)
>        ... 11 more
> 10 sept. 2009 14:40:34 org.apache.solr.handler.dataimport.DataImporter 
> doDeltaImport
> GRAVE: Delta Import Failed
> org.apache.solr.handler.dataimport.DataImportHandlerException: Parsing failed 
> for xml, url:null rows processed:0 Processing Document # 1
>        at 
> org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
>        at 
> org.apache.solr.handler.dataimport.XPathEntityProces

Re: Field Collapsing (was Re: Schema for group/child entity setup)

2009-09-10 Thread Uri Boness
All work and progress on this patch is done under the JIRA issue: 
https://issues.apache.org/jira/browse/SOLR-236



R. Tan wrote:

The patch which will be committed soon will add this functionality.




Where can I follow the progress of this patch?


On Mon, Sep 7, 2009 at 3:38 PM, Uri Boness  wrote:

  

Great. Nice site and very similar to my requirements.

  

thanks.

 So, right now, you get all field values by default?

Right now, no field values are returned for the collapsed documents. The

patch which will be committed soon will add this functionality.


R. Tan wrote:



Great. Nice site and very similar to my requirements.



  

There's work on the patch that is being done now which will enable you to
ask for specific field values of the collapsed documents using a
dedicated
request parameter.




So, right now, you get all field values by default?


On Sun, Sep 6, 2009 at 3:58 AM, Uri Boness  wrote:



  

You can check out http://www.ilocal.nl. If you search for a bank in
Amsterdam then you'll see that a lot of the results are collapsed. For
this
we used an older version of this patch (which works on 1.3) but a lot has
changed since then. We're currently using this patch on another project,
but
it's not live yet.


Uri

R. Tan wrote:





Thanks Uri. Your personal suggestion is appreciated and I think I'll
follow
your advice. We're still early in development and 1.4 would be a good
choice. I hope I can get field collapsing to work with my requirements.
Do
you know any live site using field collapsing already?

On Sat, Sep 5, 2009 at 5:57 PM, Uri Boness  wrote:





  

There's work on the patch that is being done now which will enable you
to
ask for specific field values of the collapsed documents using a
dedicated
request parameter. This work is not committed yet to the latest patch,
but
will be very soon. There is of course a drawback to that as well, the
collapsed documents set can be very large (depends on your data of
course)
in which case the returned result which includes the fields values can
be
rather large, which will impact performance, this is why this feature
will
be enabled only if you specify this extra parameter - by default no
field
values will be returned.

AFAIK, the latest patch should work fine with the latest build. Martijn
(which is the main maintainer of this patch) tries to keep it up to
date
with the latest builds. But I guess the safest way is to work with the
nightly build of the same date as the latest patch (though I would give
it a
try first with the latest build).

BTW, it's not an official suggestion from the Solr development team,
but
if
you ask me, if you have to choose now whether to use 1.3 or 1.4-dev, I
would
go for the later. 1.4 is supposed to be released in the upcoming week
or
two
and it bring loads of bug fixes, enhancements and extra functionality.
But
again, this is my personal suggestion.


cheers,
Uri

R. Tan wrote:







Okay. Thanks for giving an insight on how it works in general. Without
trying it myself, are the field values for the collapsed ones also
part
of
the results data?
What is the latest build that is safe to use on a production
environment?
I'd probably go for that and use field collapsing.

Thank you very much.


On Fri, Sep 4, 2009 at 4:49 AM, Uri Boness  wrote:







  

The collapsed documents are represented by one "master" document
which
can
be part of the normal search result (the doc list), so pagination
just
works
as expected, meaning taking only the returned documents into account
(ignoring
the collapsed ones). As for the scoring, the "master" document is
actually
the document with the highest score in the collapsed group.

As for Solr 1.3 compatibility... well... it's very hard to tell. All the
latest patches are certainly *not* 1.3 compatible (I think they also
depend on some changes in Lucene which are not available for Solr 1.3).
I guess you'll have to try some of the old patches, but I'm not sure
about their stability.

cheers,
Uri


R. Tan wrote:









Thanks Uri. How does paging and scoring work when using field
collapsing?
What patch works with 1.3? Is it production ready?

R


On Thu, Sep 3, 2009 at 3:54 PM, Uri Boness 
wrote:









  

The development on this patch is quite active. It works well for a
single Solr instance, but distributed search (i.e. shards) is not yet
supported.
Using this patch you can group search results based on a specific
field.
There are two flavors of field collapsing - adjacent and non-adjacent;
the former collapses only documents which happen to be located next to
each other in the otherwise-non-collapsed result set. The latter (the
non-adjacent one) collapses all documents with the same field value
(regardless of their position in the otherwise-non-collapsed result
set). Note that the non-adjacent one performs better than the adjacent
one. There's currently discussion to extend this support

Re: solr 1.3 and multicore data directory

2009-09-10 Thread Paul Rosen
Ok. I have a workaround for now. I've duplicated the conf folder three 
times and changed this line in solrconfig.xml in each folder:


  <dataDir>${solr.data.dir:./solr/exhibits/data}</dataDir>

I can't wait for solr 1.4!

Noble Paul നോബിള്‍ नोब्ळ् wrote:

the dataDir is a Solr1.4 feature

On Thu, Sep 10, 2009 at 1:57 AM, Paul Rosen  wrote:

Hi All,

I'm trying to set up solr 1.3 to use multicore but I'm getting some puzzling
results. My solr.xml file is:

<solr persistent="false">
 <cores adminPath="/admin/cores">
  <core name="resources" dataDir="solr/resources/data/" />
  <core name="exhibits" dataDir="solr/exhibits/data/" />
  <core name="reindex_resources" dataDir="solr/reindex_resources/data/" />
 </cores>
</solr>


When I start up solr, everything looks normal until I get this line in the
log:

INFO: [resources] Opening new SolrCore at solr/resources/,
dataDir=./solr/data/

And a new folder is created ./solr/data/index with a blank index. And, of
course, any queries go to that blank index and not to one of my cores.

Actually, what I'd really like is to have my directory structure look like
this (some items removed for brevity):

-
solr_1.3
   lib
   solr
       solr.xml
       bin
       conf
       data
           resources
               index
           exhibits
               index
           reindex_resources
               index
start.jar
-

And have all the cores share everything except an index.

How would I set that up?

Are there differences between 1.3 and 1.4 in this respect?

Thanks,
Paul









Re: Solr http post performance seems slow - help?

2009-09-10 Thread Walter Underwood
How big are your documents? Is your index on local disk or network-mounted disk?


wunder

On Sep 10, 2009, at 6:39 AM, Yonik Seeley wrote:

On Thu, Sep 10, 2009 at 9:13 AM, Dan A. Dickey  
 wrote:

I'm posting documents to Solr using http (curl) from
C++/C code and am seeing approximately 3.3 - 3.4
documents per second being posted.  Is this to be expected?


No, that's very slow.
Are you using libcurl, or actually forking a new process for every  
document?

Are you committing on every document?

If you can, using Java would make your life much easier since you
could use the SolrJ client and its binary protocol for indexing.

-Yonik
http://www.lucidimagination.com





Re: Solr http post performance seems slow - help?

2009-09-10 Thread Yonik Seeley
On Thu, Sep 10, 2009 at 9:13 AM, Dan A. Dickey  wrote:
> I'm posting documents to Solr using http (curl) from
> C++/C code and am seeing approximately 3.3 - 3.4
> documents per second being posted.  Is this to be expected?

No, that's very slow.
Are you using libcurl, or actually forking a new process for every document?
Are you committing on every document?

If you can, using Java would make your life much easier since you
could use the SolrJ client and its binary protocol for indexing.

-Yonik
http://www.lucidimagination.com
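
If sticking with curl, batching several documents into one update request and
committing once at the end usually helps a lot. A sketch against the default
example URL (field names are illustrative):

  curl 'http://localhost:8983/solr/update' -H 'Content-Type: text/xml' \
    --data-binary '<add>
      <doc><field name="id">1</field><field name="name">first doc</field></doc>
      <doc><field name="id">2</field><field name="name">second doc</field></doc>
    </add>'
  # commit once, after the whole batch
  curl 'http://localhost:8983/solr/update' --data-binary '<commit/>'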


RE: Extract info from parent node during data import

2009-09-10 Thread Fergus McMenemie
>Hi Paul,
>The forEach="/document/category/item | /document/category/name" didn't work 
>(no categoryname was stored or indexed).
>However forEach="/document/category/item | /document/category" seems to work 
>well. I am not sure why category on its own works, but not category/name...
>But thanks for the tip. It wasn't as painful as I thought it would be.
>Venn

Hmmm, I had bother with this. Although each occurrence of
/document/category/item causes a new Solr document to be indexed, that
document contained all the fields from the parent element as well.

Did you see this?

>
>> From: noble.p...@corp.aol.com
>> Date: Thu, 10 Sep 2009 09:58:21 +0530
>> Subject: Re: Extract info from parent node during data import
>> To: solr-user@lucene.apache.org
>> 
>> try this
>> 
>> add two xpaths in your forEach
>> 
>> forEach="/document/category/item | /document/category/name"
>> 
>> and add a field as follows
>> <field column="categoryname" xpath="/document/category/name"
>> commonField="true"/>
>> 
>> Please try it out and let me know.
>> 
>> On Thu, Sep 10, 2009 at 7:30 AM, venn hardy  wrote:
>> >
>> > Hello,
>> >
>> >
>> >
>> > I am using SOLR 1.4 (from nighly build) and its URLDataSource in 
>> > conjunction with the XPathEntityProcessor. I have successfully imported 
>> > XML content, but I think I may have found a limitation when it comes to 
>> > the commonField attribute in the DataImportHandler.
>> >
>> >
>> >
>> > Before writing my own parser to read in a whole XML document, I thought 
>> > I'd post the question here (since I got some great advice last time).
>> >
>> >
>> >
>> > The bulk of my content is contained within each <item> tag. However, each 
>> > item has a parent called <category> and each category has a name which I 
>> > would like to import. In my forEach loop I specify the 
>> > /document/category/item as the collection of items I am interested in. Is 
>> > there anyway to extract an element from underneath a parent node? To be a 
>> > more more specific (see eg xml below). I would like to index the following:
>> >
>> > - category: Category 1; id: 1; author: Author 1
>> >
>> > - category: Category 1; id: 2; author: Author 2
>> >
>> > - category: Category 2; id: 3; author: Author 3
>> >
>> > - category: Category 2; id: 4; author: Author 4
>> >
>> >
>> >
>> > Any ideas on how I can get to a parent node from within a child during 
>> > data import? If it cant be done, what do you suggest would be the best way 
>> > so I can keep using the DataImportHandler... would XSLT be a good idea to 
>> > 'flatten out' the structure a bit?
>> >
>> >
>> >
>> > Thanks
>> >
>> >
>> >
>> > This is what my XML document looks like:
>> >
>> > <document>
>> >  <category>
>> >  <name>Category 1</name>
>> >  <item>
>> >   <id>1</id>
>> >   <author>Author 1</author>
>> >  </item>
>> >  <item>
>> >   <id>2</id>
>> >   <author>Author 2</author>
>> >  </item>
>> >  </category>
>> >  <category>
>> >  <name>Category 2</name>
>> >  <item>
>> >   <id>3</id>
>> >   <author>Author 3</author>
>> >  </item>
>> >  <item>
>> >   <id>4</id>
>> >   <author>Author 4</author>
>> >  </item>
>> >  </category>
>> > </document>
>> >
>> >
>> >
>> > And this is what my dataConfig looks like:
>> > <dataConfig>
>> >  <dataSource type="URLDataSource" name="dataSource"/>
>> >  <document>
>> >   <entity name="..." url="http://localhost:9080/data/20090817070752.xml"
>> > processor="XPathEntityProcessor" forEach="/document/category/item"
>> > transformer="DateFormatTransformer" stream="true" dataSource="dataSource">
>> >    <field column="categoryname" xpath="/document/category/name" commonField="true" />
>> >    <field column="id" xpath="/document/category/item/id" />
>> >    <field column="author" xpath="/document/category/item/author" />
>> >   </entity>
>> >  </document>
>> > </dataConfig>
>> >
>> >
>> >
>> > This is how I have specified my schema
>> > <fields>
>> >   <field name="id" type="string" indexed="true" stored="true" required="true" />
>> >   <field name="author" type="string" indexed="true" stored="true" />
>> >   <field name="categoryname" type="string" indexed="true" stored="true" />
>> > </fields>
>> >
>> > <uniqueKey>id</uniqueKey>
>> > <defaultSearchField>id</defaultSearchField>
>> >
>> >
>> >
>> >
>> >
>> >
>> > _
>> > Need a place to rent, buy or share? Let us find your next place for you!
>> > http://clk.atdmt.com/NMN/go/157631292/direct/01/
>> 
>> 
>> 
>> -- 
>> -
>> Noble Paul | Principal Engineer| AOL | http://aol.com
>
>_
>Get Hotmail on your iPhone Find out how here
>http://windowslive.ninemsn.com.au/article.aspx?id=845706

-- 

===
Fergus McMenemie   Email:fer...@twig.me.uk
Techmore Ltd   Phone:(UK) 07721 376021

Unix/Mac/Intranets Analyst Programmer
===


Solr http post performance seems slow - help?

2009-09-10 Thread Dan A. Dickey
I'm posting documents to Solr using http (curl) from
C++/C code and am seeing approximately 3.3 - 3.4
documents per second being posted.  Is this to be expected?
Granted - I understand that this depends somewhat on the
machine running Solr.  By the way - I'm running Solr inside JBoss.

I was hoping for maybe 20 or more docs/sec, and 3 or so
is quite a way from that.

Also, I'm posting just a single document at a time.  I once tried
5 processes each posting documents, and that slowed things
down considerably.  Down into the multiple (5-10) seconds per document.

Does anyone have suggestions on what I can try?  I'll soon
have better servers installed and will be splitting the indexing
work from the searching - but at this point in time, I wasn't doing
indexing while searching anyway.  Thanks for any and all help!
-Dan

-- 
Dan A. Dickey | Senior Software Engineer

Savvis
10900 Hampshire Ave. S., Bloomington, MN  55438
Office: 952.852.4803 | Fax: 952.852.4951
E-mail: dan.dic...@savvis.net


Re : Indexing fields dynamically

2009-09-10 Thread nourredine khadri
Thanks for the quick reply.

OK for dynamicFields, but how can I rename fields during indexing/search to
add the suffix corresponding to the type?

What is the best way to do this?

Nourredine.
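
One common approach (a sketch - the suffixes follow the stock example-schema
conventions, and the client-side renaming is illustrative): declare typed
dynamicField patterns in schema.xml, then have the indexing client append the
suffix that matches each value's runtime type.

  <!-- schema.xml: the suffix picks the type -->
  <dynamicField name="*_s"  type="string" indexed="true" stored="true"/>
  <dynamicField name="*_i"  type="sint"   indexed="true" stored="true"/>
  <dynamicField name="*_dt" type="date"   indexed="true" stored="true"/>

  // SolrJ side: pick the suffix from the value's runtime type
  String suffix = (value instanceof Integer) ? "_i"
                : (value instanceof java.util.Date) ? "_dt"
                : "_s";
  doc.addField(name + suffix, value);

The same suffix has to be appended again at query time, so the mapping needs
to live somewhere both the indexer and the searcher can see.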





From: Yonik Seeley 
To: solr-user@lucene.apache.org
Sent: Thursday, 10 September 2009, 14:24:26
Subject: Re: Indexing fields dynamically

On Thu, Sep 10, 2009 at 5:58 AM, nourredine khadri
 wrote:
> I want to index my fields dynamically.
>
> DynamicFields don't suit my need because I don't know field names in advance
> and field types must be set dynamically too (need strong typing).

This is what dynamic fields are meant for - you pick both the name and
type (from a pre-defined set of types of course) at runtime.  The
suffix of the field name matches one of the dynamic fields and
essentially picks the type.

-Yonik
http://www.lucidimagination.com



  

Re: Extract info from parent node during data import

2009-09-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
In my tests both seem to be working. I had misspelt the column as
"catgoryname"; is that why?

keep in mind that you get extra docs for each "category" also
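
If the extra category-only documents are unwanted, one way to drop them is
the ScriptTransformer together with the $skipDoc flag (a sketch, assuming a
recent 1.4 build; the function name is illustrative):

  <script><![CDATA[
    function skipCategoryRows(row) {
      // rows produced by the /document/category xpath carry no item id
      if (row.get('id') == null) row.put('$skipDoc', 'true');
      return row;
    }
  ]]></script>
  ...
  <entity ... transformer="script:skipCategoryRows">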



On Thu, Sep 10, 2009 at 5:53 PM, venn hardy  wrote:
>
> Hi Paul,
> The forEach="/document/category/item | /document/category/name" didn't work 
> (no categoryname was stored or indexed).
> However forEach="/document/category/item | /document/category" seems to work 
> well. I am not sure why category on its own works, but not category/name...
> But thanks for the tip. It wasn't as painful as I thought it would be.
> Venn
>
>> From: noble.p...@corp.aol.com
>> Date: Thu, 10 Sep 2009 09:58:21 +0530
>> Subject: Re: Extract info from parent node during data import
>> To: solr-user@lucene.apache.org
>>
>> try this
>>
>> add two xpaths in your forEach
>>
>> forEach="/document/category/item | /document/category/name"
>>
>> and add a field as follows
>> <field column="categoryname" xpath="/document/category/name"
>> commonField="true"/>
>>
>> Please try it out and let me know.
>>
>> On Thu, Sep 10, 2009 at 7:30 AM, venn hardy  wrote:
>> >
>> > Hello,
>> >
>> >
>> >
>> > I am using SOLR 1.4 (from nighly build) and its URLDataSource in 
>> > conjunction with the XPathEntityProcessor. I have successfully imported 
>> > XML content, but I think I may have found a limitation when it comes to 
>> > the commonField attribute in the DataImportHandler.
>> >
>> >
>> >
>> > Before writing my own parser to read in a whole XML document, I thought 
>> > I'd post the question here (since I got some great advice last time).
>> >
>> >
>> >
>> > The bulk of my content is contained within each <item> tag. However, each 
>> > item has a parent called <category> and each category has a name which I 
>> > would like to import. In my forEach loop I specify the 
>> > /document/category/item as the collection of items I am interested in. Is 
>> > there anyway to extract an element from underneath a parent node? To be a 
>> > more more specific (see eg xml below). I would like to index the following:
>> >
>> > - category: Category 1; id: 1; author: Author 1
>> >
>> > - category: Category 1; id: 2; author: Author 2
>> >
>> > - category: Category 2; id: 3; author: Author 3
>> >
>> > - category: Category 2; id: 4; author: Author 4
>> >
>> >
>> >
>> > Any ideas on how I can get to a parent node from within a child during 
>> > data import? If it cant be done, what do you suggest would be the best way 
>> > so I can keep using the DataImportHandler... would XSLT be a good idea to 
>> > 'flatten out' the structure a bit?
>> >
>> >
>> >
>> > Thanks
>> >
>> >
>> >
>> > This is what my XML document looks like:
>> >
>> > <document>
>> >  <category>
>> >  <name>Category 1</name>
>> >  <item>
>> >   <id>1</id>
>> >   <author>Author 1</author>
>> >  </item>
>> >  <item>
>> >   <id>2</id>
>> >   <author>Author 2</author>
>> >  </item>
>> >  </category>
>> >  <category>
>> >  <name>Category 2</name>
>> >  <item>
>> >   <id>3</id>
>> >   <author>Author 3</author>
>> >  </item>
>> >  <item>
>> >   <id>4</id>
>> >   <author>Author 4</author>
>> >  </item>
>> >  </category>
>> > </document>
>> >
>> >
>> >
>> > And this is what my dataConfig looks like:
>> > <dataConfig>
>> >  <dataSource type="URLDataSource" name="dataSource"/>
>> >  <document>
>> >   <entity name="..." url="http://localhost:9080/data/20090817070752.xml"
>> > processor="XPathEntityProcessor" forEach="/document/category/item"
>> > transformer="DateFormatTransformer" stream="true" dataSource="dataSource">
>> >    <field column="categoryname" xpath="/document/category/name" commonField="true" />
>> >    <field column="id" xpath="/document/category/item/id" />
>> >    <field column="author" xpath="/document/category/item/author" />
>> >   </entity>
>> >  </document>
>> > </dataConfig>
>> >
>> >
>> >
>> > This is how I have specified my schema
>> > <fields>
>> >   <field name="id" type="string" indexed="true" stored="true" required="true" />
>> >   <field name="author" type="string" indexed="true" stored="true" />
>> >   <field name="categoryname" type="string" indexed="true" stored="true" />
>> > </fields>
>> >
>> > <uniqueKey>id</uniqueKey>
>> > <defaultSearchField>id</defaultSearchField>
>> >
>> >
>> >
>> >
>> >
>> >
>> > _
>> > Need a place to rent, buy or share? Let us find your next place for you!
>> > http://clk.atdmt.com/NMN/go/157631292/direct/01/
>>
>>
>>
>> --
>> -
>> Noble Paul | Principal Engineer| AOL | http://aol.com
>
> _
> Get Hotmail on your iPhone Find out how here
> http://windowslive.ninemsn.com.au/article.aspx?id=845706



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Re : Re : Re : Pb using delta import with XPathEntityProcessor

2009-09-10 Thread nourredine khadri
Some fields are null, but not the one parsed by XPathEntityProcessor (named XML).

10 sept. 2009 14:40:34 org.apache.solr.handler.dataimport.LogTransformer 
transformRow
FIN: Map content : {KEYWORDS=pub, SPECIFIC=null, FATHERSID=, CONTAINERID=, 
ARCHIVEDDATE=0, SITE=12308, LANGUAGE=null, ARCHIVESTATE=false, OFFLINEATDATE=0, 
ONLINEATDATE=1026307864230, STATUS=0, DATESTATUS=1113905585726, MODEL=0, 
ACTIVATIONSTATE=true, MOUNTED_SITE_IDS=null, SPECIFIC_XML=null, 
PUBLICATIONSTATE=true, XML= 
 Empty 
Subtitle - Click Here to edit   Empty Title - 
Click Here to edit   Empty Chap¶ - Click Here 
to edit   Empty Autor - Click Here to edit 
  Empty Catchword - Click Here to edit   
Empty InterTitle - Cl
ick Here to edit TextEmpty Paragraph - Click Here to 
edit Text
, IDENTIFIERVERSION=5040052, CONTENTID=5040052}
10 sept. 2009 14:40:34 org.apache.solr.handler.dataimport.DocBuilder 
buildDocument
GRAVE: Exception while processing: xml_document document : 
SolrInputDocument[{keywords=keywords(1.0)={pub}, fathersId=fathersId(1.0)={}, 
containerId=containerId(1.0)={}, site=site(1.0)={12308}, 
archiveState=archiveState(1.0)={false}, offlineAtDate=offlineAtDate(1.0)={0}, 
onlineAtDate=onlineAtDate(1.0)={1026307864230}, status=status(1.0)={0}, 
dateStatus=dateStatus(1.0)={1113905585726}, model=model(1.0)={0}, 
activationState=activationState(1.0)={true}, 
publicationState=publicationState(1.0)={true}, xml=xml(1.0)={  Empty Subtitle - Click Here to 
edit   Empty Title - Click Here to 
edit   Empty Chap¶ - Click Here to edit<
/Parag>   Empty Autor - Click Here to edit   
Empty Catchword - Click Here to edit   
Empty InterTitle - Click Here to edit 
TextEmpty Paragraph - Click Here to edit 
Text
}, identifierversion=identifierversion(1.0)={5040052}, 
contentid=contentid(1.0)={5040052}}]
org.apache.solr.handler.dataimport.DataImportHandlerException: Parsing failed 
for xml, url:null rows processed:0 Processing Document # 1
at 
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:292)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:187)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:164)
at 
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:237)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:365)
at 
org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:259)
at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:159)
at 
org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:354)
at 
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:395)
at 
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:372)
Caused by: java.lang.RuntimeException: java.lang.NullPointerException
at 
org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:92)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:282)
... 10 more
Caused by: java.lang.NullPointerException
at 
com.ctc.wstx.io.ReaderBootstrapper.initialLoad(ReaderBootstrapper.java:245)
at 
com.ctc.wstx.io.ReaderBootstrapper.bootstrapInput(ReaderBootstrapper.java:132)
at 
com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:543)
at 
com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:604)
at 
com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:660)
at 
com.ctc.wstx.stax.WstxInputFactory.createXMLStreamReader(WstxInputFactory.java:331)
at 
org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:88)
... 11 more
10 sept. 2009 14:40:34 org.apache.solr.handler.dataimport.DataImporter 
doDeltaImport
GRAVE: Delta Import Failed
org.apache.solr.handler.dataimport.DataImportHandlerException: Parsing failed 
for xml, url:null rows processed:0 Processing Document # 1
at 
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:292)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:187)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:164)
at 
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:237)
at 
org.apache.solr.h

Re: Re : Re : Pb using delta import with XPathEntityProcessor

2009-09-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
What do you see if you keep logTemplate="${document}"? I'm trying
to figure out the contents of the map.


Re: How to Convert Lucene index files to XML Format

2009-09-10 Thread busbus

Thanks for your reply

> On Sep 10, 2009, at 6:41 AM, busbus wrote:
> Solr defers to Lucene on reading the index.  You just need to tell  
> Solr whether the index is a compound file or not and make sure the 
> versions are compatible.
> 

This part seems to be the point.
How do I make Solr read Lucene index files?
There is a tag in solrconfig.xml:
 <useCompoundFile>false</useCompoundFile> 

Setting it to true does not seem to work.

What else needs to be done?

Should I change the config file or add a new tag?

Also, how do I check the compatibility of Lucene and Solr?

Thanks in advance
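
Note that <useCompoundFile> only controls how Solr writes new segments; Solr
(via Lucene) can read either format. To check that the Lucene versions line
up, one option is Lucene's CheckIndex tool run with the lucene-core jar that
ships inside the Solr webapp (a sketch; adjust the jar name/path to your
install):

  java -cp lucene-core-2.4.1.jar org.apache.lucene.index.CheckIndex /path/to/index

If it reports the segments cleanly, a Solr built on that Lucene version
should open the index as-is.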

-- 
View this message in context: 
http://www.nabble.com/How-to-Convert-Lucene-index-files-to-XML-Format-tp25381017p25382367.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Indexing fields dynamically

2009-09-10 Thread Yonik Seeley
On Thu, Sep 10, 2009 at 5:58 AM, nourredine khadri
 wrote:
> I want to index my fields dynamically.
>
> DynamicFields don't suit my need because I don't know field names in advance
> and field types must be set dynamically too (need strong typing).

This is what dynamic fields are meant for - you pick both the name and
type (from a pre-defined set of types of course) at runtime.  The
suffix of the field name matches one of the dynamic fields and
essentially picks the type.

-Yonik
http://www.lucidimagination.com


Re: Solr: ERRORs at Startup

2009-09-10 Thread con

Hi Giovanni, 

I am facing the same issue. Can you share some info on how you solved this
puzzle?





hossman wrote:
> 
> 
> : Even setting everything to INFO through
> : http://localhost:8080/solr/admin/logging didn't help.
> : 
> : But considering you do not see any bad issue here, at this time I will
> : ignore those ERROR messages :-)
> 
> i would read up more on how to configure logging in JBoss.
> 
> as far as i can tell, Solr is logging messages, which are getting handled 
> by a logger that writes them to STDERR using a fairly standard format 
> (date, class, method, level, msg) ... except some other piece of code 
> seems to be reading from STDERR, and assuming anything that got written 
> there is an ERROR, so it's logging those writes to stderr using a format 
> with a date, a level (of ERROR), and a group or some other identifier of 
> "STDERR"
> 
> the problem is if you ignore them completely, you're going to miss 
> noticing when you really have a problem.
> 
> Like i said: figure out how to configure logging in JBoss, you might need 
> to change the slf4j adapter jar or something if it can't deal with JUL 
> (which is the default).
> 
> : >> 10:51:20,525 INFO  [TomcatDeployment] deploy, ctxPath=/solr
> : >> 10:51:20,617 ERROR [STDERR] Mar 13, 2009 10:51:20 AM
> : >> org.apache.solr.servlet.SolrDispatchFilter init
> : >> INFO: SolrDispatchFilter.init()
> 
> 
> 
> -Hoss
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Solr%3A-ERRORs-at-Startup-tp22493300p25382340.html
Sent from the Solr - User mailing list archive at Nabble.com.



RE: Extract info from parent node during data import

2009-09-10 Thread venn hardy

Hi Paul,
The forEach="/document/category/item | /document/category/name" didn't work (no 
categoryname was stored or indexed).
However forEach="/document/category/item | /document/category" seems to work 
well. I am not sure why category on its own works, but not category/name...
But thanks for the tip. It wasn't as painful as I thought it would be.
Venn

> From: noble.p...@corp.aol.com
> Date: Thu, 10 Sep 2009 09:58:21 +0530
> Subject: Re: Extract info from parent node during data import
> To: solr-user@lucene.apache.org
> 
> try this
> 
> add two xpaths in your forEach
> 
> forEach="/document/category/item | /document/category/name"
> 
> and add a field as follows
> <field column="categoryname" xpath="/document/category/name"
> commonField="true"/>
> 
> Please try it out and let me know.
> 
> On Thu, Sep 10, 2009 at 7:30 AM, venn hardy  wrote:
> >
> > Hello,
> >
> >
> >
> > I am using SOLR 1.4 (from nighly build) and its URLDataSource in 
> > conjunction with the XPathEntityProcessor. I have successfully imported XML 
> > content, but I think I may have found a limitation when it comes to the 
> > commonField attribute in the DataImportHandler.
> >
> >
> >
> > Before writing my own parser to read in a whole XML document, I thought I'd 
> > post the question here (since I got some great advice last time).
> >
> >
> >
> > The bulk of my content is contained within each <item> tag. However, each 
> > item has a parent called <category> and each category has a name which I 
> > would like to import. In my forEach loop I specify the 
> > /document/category/item as the collection of items I am interested in. Is 
> > there anyway to extract an element from underneath a parent node? To be a 
> > more more specific (see eg xml below). I would like to index the following:
> >
> > - category: Category 1; id: 1; author: Author 1
> >
> > - category: Category 1; id: 2; author: Author 2
> >
> > - category: Category 2; id: 3; author: Author 3
> >
> > - category: Category 2; id: 4; author: Author 4
> >
> >
> >
> > Any ideas on how I can get to a parent node from within a child during data 
> > import? If it cant be done, what do you suggest would be the best way so I 
> > can keep using the DataImportHandler... would XSLT be a good idea to 
> > 'flatten out' the structure a bit?
> >
> >
> >
> > Thanks
> >
> >
> >
> > This is what my XML document looks like:
> >
> > <document>
> >  <category>
> >  <name>Category 1</name>
> >  <item>
> >   <id>1</id>
> >   <author>Author 1</author>
> >  </item>
> >  <item>
> >   <id>2</id>
> >   <author>Author 2</author>
> >  </item>
> >  </category>
> >  <category>
> >  <name>Category 2</name>
> >  <item>
> >   <id>3</id>
> >   <author>Author 3</author>
> >  </item>
> >  <item>
> >   <id>4</id>
> >   <author>Author 4</author>
> >  </item>
> >  </category>
> > </document>
> >
> >
> >
> > And this is what my dataConfig looks like:
> > <dataConfig>
> >  <dataSource type="URLDataSource" name="dataSource"/>
> >  <document>
> >   <entity name="..." url="http://localhost:9080/data/20090817070752.xml"
> > processor="XPathEntityProcessor" forEach="/document/category/item"
> > transformer="DateFormatTransformer" stream="true" dataSource="dataSource">
> >    <field column="categoryname" xpath="/document/category/name" commonField="true" />
> >    <field column="id" xpath="/document/category/item/id" />
> >    <field column="author" xpath="/document/category/item/author" />
> >   </entity>
> >  </document>
> > </dataConfig>
> >
> >
> >
> > This is how I have specified my schema
> > <fields>
> >   <field name="id" type="string" indexed="true" stored="true" required="true" />
> >   <field name="author" type="string" indexed="true" stored="true" />
> >   <field name="categoryname" type="string" indexed="true" stored="true" />
> > </fields>
> >
> > <uniqueKey>id</uniqueKey>
> > <defaultSearchField>id</defaultSearchField>
> >
> >
> >
> >
> >
> >
> > _
> > Need a place to rent, buy or share? Let us find your next place for you!
> > http://clk.atdmt.com/NMN/go/157631292/direct/01/
> 
> 
> 
> -- 
> -
> Noble Paul | Principal Engineer| AOL | http://aol.com

_
Get Hotmail on your iPhone Find out how here
http://windowslive.ninemsn.com.au/article.aspx?id=845706

Re : Re : Pb using delta import with XPathEntityProcessor

2009-09-10 Thread nourredine khadri
That's the case. The field is not null.

10 sept. 2009 14:10:54 org.apache.solr.handler.dataimport.LogTransformer 
transformRow
FIN: id : 5040052 - Xml content :  
 Empty 
Subtitle - Click Here to edit   Empty Title - 
Click Here to edit   Empty Chap¶ - Click Here 
to edit   Empty Autor - Click Here to edit 
  Empty Catchword - Click Here to edit   
Empty InterTitle - Click Here to edit 
TextEmpty Paragraph - Click Here to edit 
Text

10 sept. 2009 14:10:54 org.apache.solr.handler.dataimport.DocBuilder 
buildDocument
GRAVE: Exception while processing: xml_document document : 
SolrInputDocument[{keywords=keywords(1.0)={pub}, fathersId=fathersId(1.0)={}, 
containerId=containerId(1.0)={}, site=site(1.0)={12308}, 
archiveState=archiveState(1.0)={false}, offlineAtDate=offlineAtDate(1.0)={0}, 
onlineAtDate=onlineAtDate(1.0)={1026307864230}, status=status(1.0)={0}, 
dateStatus=dateStatus(1.0)={1113905585726}, model=model(1.0)={0}, 
activationState=activationState(1.0)={true}, 
publicationState=publicationState(1.0)={true}, xml=xml(1.0)={  Empty Subtitle - Click Here to 
edit   Empty Title - Click Here to 
edit   Empty Chap¶ - Click Here to edit<
/Parag>   Empty Autor - Click Here to edit   
Empty Catchword - Click Here to edit   
Empty InterTitle - Click Here to edit 
TextEmpty Paragraph - Click Here to edit 
Text
}, identifierversion=identifierversion(1.0)={5040052}, 
contentid=contentid(1.0)={5040052}}]
org.apache.solr.handler.dataimport.DataImportHandlerException: Parsing failed 
for xml, url:null rows processed:0 Processing Document # 1
at 
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:292)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:187)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:164)
at 
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:237)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:365)
at 
org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:259)
at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:159)
at 
org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:354)
at 
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:395)
at 
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:372)
Caused by: java.lang.RuntimeException: java.lang.NullPointerException
at 
org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:92)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:282)
... 10 more
Caused by: java.lang.NullPointerException
at 
com.ctc.wstx.io.ReaderBootstrapper.initialLoad(ReaderBootstrapper.java:245)
at 
com.ctc.wstx.io.ReaderBootstrapper.bootstrapInput(ReaderBootstrapper.java:132)
at 
com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:543)
at 
com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:604)
at 
com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:660)
at 
com.ctc.wstx.stax.WstxInputFactory.createXMLStreamReader(WstxInputFactory.java:331)
at 
org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:88)
... 11 more
10 sept. 2009 14:10:54 org.apache.solr.handler.dataimport.DataImporter 
doDeltaImport
GRAVE: Delta Import Failed
org.apache.solr.handler.dataimport.DataImportHandlerException: Parsing failed 
for xml, url:null rows processed:0 Processing Document # 1
at 
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:292)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:187)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:164)
at 
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:237)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:365)
at 
org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:259)
at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:159)
at 
org.apache.solr.handler.dataimpor

Re: How to Convert Lucene index files to XML Format

2009-09-10 Thread Grant Ingersoll


On Sep 10, 2009, at 6:41 AM, busbus wrote:



Hello All,
I have a set of files indexed by Lucene. Now I want to use the indexed files
in Solr. The .cfx and .cfs files are not readable by Solr, as it supports only
.fds and .fdx.


Solr defers to Lucene on reading the index.  You just need to tell  
Solr whether the index is a compound file or not and make sure the  
versions are compatible.


What error are you getting?




So I decided to add/update the index by just loading an XML file using the
post.jar function.

java -jar post.jar newFile.XML - loads the XML and updates the index.

Now I want to convert all the .cfx files to XML so that I can use them in
Solr.

Advice Needed.


I suppose you could walk the documents and dump them out to XML,  
assuming you have stored all your fields.
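
A sketch of that walk (Lucene 2.4-era API; it only recovers fields that were
stored):

  import org.apache.lucene.document.Document;
  import org.apache.lucene.document.Fieldable;
  import org.apache.lucene.index.IndexReader;

  public class DumpIndexToXml {
    static String esc(String s) { // minimal XML escaping
      return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
    public static void main(String[] args) throws Exception {
      IndexReader reader = IndexReader.open(args[0]); // path to the index dir
      System.out.println("<add>");
      for (int i = 0; i < reader.maxDoc(); i++) {
        if (reader.isDeleted(i)) continue;        // skip deleted docs
        Document doc = reader.document(i);        // stored fields only
        System.out.println(" <doc>");
        for (Object o : doc.getFields()) {
          Fieldable f = (Fieldable) o;
          if (f.stringValue() == null) continue;  // skip binary fields
          System.out.println("  <field name=\"" + f.name() + "\">"
              + esc(f.stringValue()) + "</field>");
        }
        System.out.println(" </doc>");
      }
      System.out.println("</add>");
      reader.close();
    }
  }

Redirect the output to a file and it can be fed to post.jar.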


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Re: Connection refused when excecuting the query

2009-09-10 Thread Shalin Shekhar Mangar
On Thu, Sep 10, 2009 at 4:52 PM, dharhsana wrote:

>
> Hi to all,
> when i try to execute my query i get Connection refused ,can any one please
> tell me what should be done for this ,to make my solr run.
>
> org.apache.solr.client.solrj.SolrServerException:
> java.net.ConnectException:
> Connection refused: connect
> org.apache.solr.client.solrj.SolrServerException:
> java.net.ConnectException:
> Connection refused: connect
>at
>
> org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:471)
>at
>

Your Solr server is not running at the url you have given to
CommonsHttpSolrServer. Make sure you have given the correct url and Solr is
actually up and running at that url.

-- 
Regards,
Shalin Shekhar Mangar.
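
A quick way to confirm the URL from SolrJ (a sketch; the port below is the
Jetty example default - use whatever your container actually listens on, e.g.
http://localhost:8080/solr for a default Tomcat):

  SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
  System.out.println(server.ping().getStatus()); // 0 means Solr answered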


Connection refused when excecuting the query

2009-09-10 Thread dharhsana

Hi to all,
when I try to execute my query I get Connection refused; can anyone please
tell me what should be done for this, to make my Solr run.

org.apache.solr.client.solrj.SolrServerException: java.net.ConnectException:
Connection refused: connect
org.apache.solr.client.solrj.SolrServerException: java.net.ConnectException:
Connection refused: connect
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:471)
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:242)
at
org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
at
org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
at
com.cloud.seviceImpl.InsertToSolrServiceImpl.getMyBlogs(InsertToSolrServiceImpl.java:214)
at
com.cloud.struts.action.MyBlogAction.execute(MyBlogAction.java:42)
at
org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:425)
at
org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:228)
at
org.apache.struts.action.ActionServlet.process(ActionServlet.java:1913)
at
org.apache.struts.action.ActionServlet.doGet(ActionServlet.java:449)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:567)
at
org.apache.catalina.authenticator.SingleSignOn.invoke(SingleSignOn.java:394)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
at
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
at
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
at java.lang.Thread.run(Thread.java:595)
Caused by: java.net.ConnectException: Connection refused: connect
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
at
java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:519)
at java.net.Socket.connect(Socket.java:469)
        at java.net.Socket.<init>(Socket.java:366)
        at java.net.Socket.<init>(Socket.java:239)
at
org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
at
org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
at
org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
at
org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.open(MultiThreadedHttpConnectionManager.java:1361)
at
org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
at
org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:415)


with regrards,

rekha

-- 
View this message in context: 
http://www.nabble.com/Connection-refused-when-excecuting-the-query-tp25381486p25381486.html
Sent from the Solr - User mailing list archive at Nabble.com.



How to Convert Lucene index files to XML Format

2009-09-10 Thread busbus

Hello All,
I have a set of files indexed by Lucene. Now I want to use the indexed files
in Solr. The .cfx and .cfs files are not readable by Solr, as it supports only
.fds and .fdx.

So I decided to add/update the index by just loading an XML file using the
post.jar function.

java -jar post.jar newFile.XML - loads the XML and updates the index.

Now I want to convert all the .cfx files to XML so that I can use them in
Solr.

Advice Needed.

Any other suggestions are most welcomed.

- Balaji
-- 
View this message in context: 
http://www.nabble.com/How-to-Convert-Lucene-index-files-to-XML-Format-tp25381017p25381017.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Re : Pb using delta import with XPathEntityProcessor

2009-09-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
can you just confirm that the XML field is not null by adding a
LogTransformer to the entity "document"
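
For reference, a sketch of wiring that in (attribute names per the DIH
LogTransformer; the template text is illustrative):

  <entity name="document" ... transformer="LogTransformer"
          logTemplate="XML field is: ${document.XML}" logLevel="info">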

On Thu, Sep 10, 2009 at 3:54 PM, nourredine khadri
 wrote:
> But why that occurs only for delta import and not for the full ?
>
> I've checked my data : no xml field is null.
>
> Nourredine.
>
> Noble Paul wrote :
>>
>>I guess there was a null field and the xml parser blows up
>
>
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Re: Pb using delta import with XPathEntityProcessor

2009-09-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
I just committed the fix https://issues.apache.org/jira/browse/SOLR-1420

But it does not solve your problem , it will just prevent the import
from throwing an exception and fail

2009/9/10 Noble Paul നോബിള്‍  नोब्ळ् :
> I guess there was a null field and the xml parser blows up
>
>
> On Thu, Sep 10, 2009 at 3:06 PM, nourredine khadri
>  wrote:
>> Hi,
>>
>> I'm new solR user and for the moment it suits almost all my needs :)
>>
>> I use a fresh nightly release (09/2009) and I index a
>> database table using dataImportHandler.
>>
>> I try to parse an xml content field from this table using 
>> XPathEntityProcessor
>> and FieldReaderDataSource. Everything works fine for the full-import.
>>
>> But when I try to use the delta import (i need incremental indexation) using 
>> "deltaQuery"
>> and "deltaImportQuery", it does not work and i have a stack for each
>> field :
>>
>> 10 sept. 2009 11:12:26
>> org.apache.solr.handler.dataimport.XPathEntityProcessor initQuery
>> ATTENTION: Parsing failed for xml, url:null rows processed:0
>> java.lang.RuntimeException: java.lang.NullPointerException
>>        at 
>> org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:92)
>>        at
>> org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:282)
>>        at
>> org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:187)
>>        at
>> org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:164)
>>        at
>> org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:237)
>>        at
>> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
>>        at
>> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:365)
>>        at
>> org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:259)
>>        at
>> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:159)
>>        at
>> org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:354)
>>        at
>> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:395)
>>        at
>> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:372)
>> Caused by: java.lang.NullPointerException
>>        at
>> com.ctc.wstx.io.ReaderBootstrapper.initialLoad(ReaderBootstrapper.java:245)
>>        at
>> com.ctc.wstx.io.ReaderBootstrapper.bootstrapInput(ReaderBootstrapper.java:132)
>>        at
>> com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:543)
>>        at
>> com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:604)
>>        at
>> com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:660)
>>        at
>> com.ctc.wstx.stax.WstxInputFactory.createXMLStreamReader(WstxInputFactory.java:331)
>>        at
>> org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:88)
>>        ... 11 more
>>
>>
>> When I remove the "delta" queries or the XPathEntityProcessor block , it's 
>> ok.
>>
>> my data-config.xml :
>>
>> <dataConfig>
>>  <dataSource name="database"
>>              type="JdbcDataSource"
>>              driver="com.mysql.jdbc.Driver"
>>              url="jdbc:mysql://xxx"
>>              user="xxx"
>>              password="xxx"/>
>>  <dataSource type="FieldReaderDataSource" name="fieldReader"/>
>>  <document>
>>
>>   <entity name="document"
>>            dataSource="database"
>>            processor="SqlEntityProcessor"
>>            pk="CONTENTID"
>>            query="SELECT * FROM SEARCH"
>>            deltaImportQuery="SELECT * FROM SEARCH WHERE
>> CONTENTID=${dataimporter.delta.CONTENTID}"
>>            deltaQuery="SELECT CONTENTID FROM SEARCH WHERE DATESTATUS
>> >= UNIX_TIMESTAMP('${dataimporter.last_index_time}')">
>>
>>      <entity name="xml_contenu"
>>              dataSource="fieldReader"
>>              processor="XPathEntityProcessor"
>>              forEach="/Contenu"
>>              dataField="document.XML"
>>              onError="continue">
>>        <field column="SurTitre" xpath="/Contenu/ArtCourt/SurTitre" flatten="true"/>
>>        <field column="Titre" xpath="/Contenu/ArtCourt/Titre" flatten="true"/>
>>        <field column="Chapeau" xpath="/Contenu/ArtCourt/Chapeau" flatten="true"/>
>>        <field column="Auteur" xpath="/Contenu/ArtCourt/AuteurW" flatten="true"/>
>>        <field column="Accroche" xpath="/Contenu/ArtCourt/Accroche" flatten="true"/>
>>        <field column="TxtCourt" xpath="/Contenu/ArtCourt/TxtCourt" flatten="true"/>
>>        <field column="Refs" xpath="/Contenu/ArtCourt/Refs" flatten="true"/>
>>      </entity>
>>   </entity>
>>
>>  </document>
>>
>> </dataConfig>
>>
>> the server query 
>> :http://localhost:8080/apache-solr-nightly/dataimport?command=delta-import
>>
>> All fields are declared in the shema.xml
>>
>> Can someone help me?
>>
>> Nourredine
>>
>>
>>
>
>
>
> --
> -
> Noble Paul | Principal

Re : Pb using delta import with XPathEntityProcessor

2009-09-10 Thread nourredine khadri
But why does that occur only for delta import and not for the full one?

I've checked my data : no xml field is null.

Nourredine.

Noble Paul wrote : 
>
>I guess there was a null field and the xml parser blows up


  

Re: Pb using delta import with XPathEntityProcessor

2009-09-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
I guess there was a null field and the xml parser blows up


On Thu, Sep 10, 2009 at 3:06 PM, nourredine khadri
 wrote:
> Hi,
>
> I'm new solR user and for the moment it suits almost all my needs :)
>
> I use a fresh nightly release (09/2009) and I index a
> database table using dataImportHandler.
>
> I try to parse an xml content field from this table using XPathEntityProcessor
> and FieldReaderDataSource. Everything works fine for the full-import.
>
> But when I try to use the delta import (i need incremental indexation) using 
> "deltaQuery"
> and "deltaImportQuery", it does not work and i have a stack for each
> field :
>
> 10 sept. 2009 11:12:26
> org.apache.solr.handler.dataimport.XPathEntityProcessor initQuery
> ATTENTION: Parsing failed for xml, url:null rows processed:0
> java.lang.RuntimeException: java.lang.NullPointerException
>        at 
> org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:92)
>        at
> org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:282)
>        at
> org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:187)
>        at
> org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:164)
>        at
> org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:237)
>        at
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
>        at
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:365)
>        at
> org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:259)
>        at
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:159)
>        at
> org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:354)
>        at
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:395)
>        at
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:372)
> Caused by: java.lang.NullPointerException
>        at
> com.ctc.wstx.io.ReaderBootstrapper.initialLoad(ReaderBootstrapper.java:245)
>        at
> com.ctc.wstx.io.ReaderBootstrapper.bootstrapInput(ReaderBootstrapper.java:132)
>        at
> com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:543)
>        at
> com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:604)
>        at
> com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:660)
>        at
> com.ctc.wstx.stax.WstxInputFactory.createXMLStreamReader(WstxInputFactory.java:331)
>        at
> org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:88)
>        ... 11 more
>
>
> When I remove the "delta" queries or the XPathEntityProcessor block , it's ok.
>
> my data-config.xml :
>
> <dataConfig>
>  <dataSource name="database"
>              type="JdbcDataSource"
>              driver="com.mysql.jdbc.Driver"
>              url="jdbc:mysql://xxx"
>              user="xxx"
>              password="xxx"/>
>  <dataSource type="FieldReaderDataSource" name="fieldReader"/>
>  <document>
>
>   <entity name="document"
>            dataSource="database"
>            processor="SqlEntityProcessor"
>            pk="CONTENTID"
>            query="SELECT * FROM SEARCH"
>            deltaImportQuery="SELECT * FROM SEARCH WHERE
> CONTENTID=${dataimporter.delta.CONTENTID}"
>            deltaQuery="SELECT CONTENTID FROM SEARCH WHERE DATESTATUS
> >= UNIX_TIMESTAMP('${dataimporter.last_index_time}')">
>
>      <entity name="xml_contenu"
>              dataSource="fieldReader"
>              processor="XPathEntityProcessor"
>              forEach="/Contenu"
>              dataField="document.XML"
>              onError="continue">
>        <field column="SurTitre" xpath="/Contenu/ArtCourt/SurTitre" flatten="true"/>
>        <field column="Titre" xpath="/Contenu/ArtCourt/Titre" flatten="true"/>
>        <field column="Chapeau" xpath="/Contenu/ArtCourt/Chapeau" flatten="true"/>
>        <field column="Auteur" xpath="/Contenu/ArtCourt/AuteurW" flatten="true"/>
>        <field column="Accroche" xpath="/Contenu/ArtCourt/Accroche" flatten="true"/>
>        <field column="TxtCourt" xpath="/Contenu/ArtCourt/TxtCourt" flatten="true"/>
>        <field column="Refs" xpath="/Contenu/ArtCourt/Refs" flatten="true"/>
>      </entity>
>   </entity>
>
>  </document>
>
> </dataConfig>
>
> the server query 
> :http://localhost:8080/apache-solr-nightly/dataimport?command=delta-import
>
> All fields are declared in the shema.xml
>
> Can someone help me?
>
> Nourredine
>
>
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Indexing fields dynamically

2009-09-10 Thread nourredine khadri
Hello,

I want to index my fields dynamically. 

DynamicFields don't suit my need because I don't know field names in advance
and field types must be set dynamically too (need strong typing).

I think the solution is to handle this programmatically, but what is the best
way to do this? Which custom handler and API should I use?

Nourredine.



  

Re: WebLogic 10 Compatibility Issue - StackOverflowError

2009-09-10 Thread Ilan Rabinovitch
In testing Solr 1.4 today with Weblogic it looks like the filters issue 
still exists.  Adding the appropriate entries in weblogic.xml still 
resolves it.


On first look, the header.jsp changes don't appear to be required anymore.

Would it make sense to include a weblogic.xml in the distribution to 
disable the filters, or should this be an exercise for
users/administrators who choose to deploy this under WebLogic?



On 2/3/09 10:26 PM, Ilan Rabinovitch wrote:

We believe that the filters/forward issue is likely something specific
to WebLogic. Specifically, other containers have filters disabled on
forward by default, whereas WebLogic has them enabled.


We don't think the small modifications we had to make to header.jsp are
WebLogic-specific.





On 1/30/09 8:15 AM, Feak, Todd wrote:

Are the issues ran into due to non-standard code in Solr, or is there
some WebLogic inconsistency?

-Todd Feak

-Original Message-
From: news [mailto:n...@ger.gmane.org] On Behalf Of Ilan Rabinovitch
Sent: Friday, January 30, 2009 1:11 AM
To: solr-user@lucene.apache.org
Subject: Re: WebLogic 10 Compatibility Issue - StackOverflowError

I created a wiki page shortly after posting to the list:

http://wiki.apache.org/solr/SolrWeblogic

From what we could tell Solr itself was fully functional, it was only
the admin tools that were failing.

Regards,
Ilan Rabinovitch

---
SCALE 7x: 2009 Southern California Linux Expo
Los Angeles, CA
http://www.socallinuxexpo.org


On 1/29/09 4:34 AM, Mark Miller wrote:

We should get this on the wiki.

- Mark


Ilan Rabinovitch wrote:

We were able to deploy Solr 1.3 on Weblogic 10.0 earlier today. Doing
so required two changes:

1) Creating a weblogic.xml file in solr.war's WEB-INF directory. The
weblogic.xml file is required to disable Solr's filter on FORWARD.

The contents of weblogic.xml should be:

<?xml version="1.0" encoding="UTF-8"?>
<weblogic-web-app xmlns="http://www.bea.com/ns/weblogic/90"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.bea.com/ns/weblogic/90
        http://www.bea.com/ns/weblogic/90/weblogic-web-app.xsd">

  <container-descriptor>
    <filter-dispatched-requests-enabled>false</filter-dispatched-requests-enabled>
  </container-descriptor>

</weblogic-web-app>
2) Remove the pageEncoding attribute from line 1 of solr/admin/header.jsp




On 1/17/09 2:02 PM, KSY wrote:

I hit a major roadblock while trying to get Solr 1.3 running on WebLogic
10.0.

A similar message was posted before
(http://www.nabble.com/Solr-1.3-stack-overflow-when-accessing-solr-admin-page-td20157873.html)
- but it seems like it hasn't been resolved yet, so I'm re-posting
here.

I am sure I configured everything correctly because it's working fine on
Resin.

Has anyone successfully run Solr 1.3 on WebLogic 10.0 or higher?

Thanks.


SUMMARY:

When accessing /solr/admin page, StackOverflowError occurs due to an
infinite recursion in SolrDispatchFilter


ENVIRONMENT SETTING:

Solr 1.3.0
WebLogic 10.0
JRockit JVM 1.5


ERROR MESSAGE:

SEVERE: javax.servlet.ServletException: java.lang.StackOverflowError
    at weblogic.servlet.internal.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:276)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:273)
    at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:42)
    at weblogic.servlet.internal.RequestDispatcherImpl.invokeServlet(RequestDispatcherImpl.java:526)
    at weblogic.servlet.internal.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:261)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:273)
    at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:42)
    at weblogic.servlet.internal.RequestDispatcherImpl.invokeServlet(RequestDispatcherImpl.java:526)
    at weblogic.servlet.internal.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:261)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:273)
    at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:42)
    at weblogic.servlet.internal.RequestDispatcherImpl.invokeServlet(RequestDispatcherImpl.java:526)
    at weblogic.servlet.internal.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:261)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:273)

--
Ilan Rabinovitch
i...@fonz.net

---
SCALE 8x: 2010 Southern California Linux Expo
Feb 19-21, 2010
Los Angeles, CA
http://www.socallinuxexpo.org



Pb using delta import with XPathEntityProcessor

2009-09-10 Thread nourredine khadri
Hi, 
 
I'm a new Solr user and for the moment it suits almost all my needs :)
 
I use a fresh nightly release (09/2009) and I index a
database table using dataImportHandler.
 
I try to parse an xml content field from this table using XPathEntityProcessor
and FieldReaderDataSource. Everything works fine for the full-import.
 
But when I try to use the delta import (I need incremental indexing) using
"deltaQuery" and "deltaImportQuery", it does not work and I get a stack trace
for each field:
 
10 sept. 2009 11:12:26
org.apache.solr.handler.dataimport.XPathEntityProcessor initQuery
ATTENTION: Parsing failed for xml, url:null rows processed:0
java.lang.RuntimeException: java.lang.NullPointerException
at 
org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:92)
at
org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:282)
at
org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:187)
at
org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:164)
at
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:237)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:365)
at
org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:259)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:159)
at
org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:354)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:395)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:372)
Caused by: java.lang.NullPointerException
at
com.ctc.wstx.io.ReaderBootstrapper.initialLoad(ReaderBootstrapper.java:245)
at
com.ctc.wstx.io.ReaderBootstrapper.bootstrapInput(ReaderBootstrapper.java:132)
at
com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:543)
at
com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:604)
at
com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:660)
at
com.ctc.wstx.stax.WstxInputFactory.createXMLStreamReader(WstxInputFactory.java:331)
at
org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:88)
... 11 more
 
 
When I remove the "delta" queries or the XPathEntityProcessor block, it's OK.
 
my data-config.xml : 
 
<dataConfig>
  <dataSource name="database"
              type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://xxx"
              user="xxx"
              password="xxx"/>
  <dataSource type="FieldReaderDataSource" name="fieldReader"/>
  <document>

   <entity name="document"
           dataSource="database"
           processor="SqlEntityProcessor"
           pk="CONTENTID"
           query="SELECT * FROM SEARCH"
           deltaImportQuery="SELECT * FROM SEARCH WHERE CONTENTID=${dataimporter.delta.CONTENTID}"
           deltaQuery="SELECT CONTENTID FROM SEARCH WHERE DATESTATUS >= UNIX_TIMESTAMP('${dataimporter.last_index_time}')">

     <entity name="xml_contenu"
             dataSource="fieldReader"
             processor="XPathEntityProcessor"
             forEach="/Contenu"
             dataField="document.XML"
             onError="continue">
       <field column="SurTitre" xpath="/Contenu/ArtCourt/SurTitre" flatten="true"/>
       <field column="Titre" xpath="/Contenu/ArtCourt/Titre" flatten="true"/>
       <field column="Chapeau" xpath="/Contenu/ArtCourt/Chapeau" flatten="true"/>
       <field column="Auteur" xpath="/Contenu/ArtCourt/AuteurW" flatten="true"/>
       <field column="Accroche" xpath="/Contenu/ArtCourt/Accroche" flatten="true"/>
       <field column="TxtCourt" xpath="/Contenu/ArtCourt/TxtCourt" flatten="true"/>
       <field column="Refs" xpath="/Contenu/ArtCourt/Refs" flatten="true"/>
     </entity>
   </entity>

  </document>
</dataConfig>
 
The server query: http://localhost:8080/apache-solr-nightly/dataimport?command=delta-import
 
All fields are declared in the schema.xml
 
Can someone help me?
 
Nourredine


  

Re: Misleading log messages while deploying solr

2009-09-10 Thread con

Thanks Hossman

As per my understanding and investigation, if we disable STDERR in the
JBoss configs, we will not be able to see any STDERR output coming from any of
the APIs, which can be real error messages.
So if we know the exact reason why this message from Solr is showing up, we
can block it at the Solr level or maybe the JBoss level.

Any suggestion that points out a reason for this, or a solution that hides
only these messages, would be really appreciated.


thanks
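
One narrow option (a sketch, assuming Solr is logging through
java.util.logging, its default): raise Solr's JUL level so the routine INFO
lines never reach the handler JBoss wraps as ERROR [STDERR], e.g. in a
logging.properties passed via -Djava.util.logging.config.file=...

  org.apache.solr.level = WARNING

Real warnings and errors still get through; only the INFO chatter is dropped.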



hossman wrote:
> 
> 
> : But the log message that is getting print in the server console, in my
> case
> : jboss, is showing status as error.
> : Why is this showing as ERROR, even though things are working fine.
> 
> Solr is not declaring that those messages are ERRORs, solr is just logging 
> informational messages (hence the "INFO" lines) using the java logging 
> framework.
> 
> My guess: since the logs are getting prefixed with "ERROR [STDERR]" 
> something about the way your jboss container is configured is probably 
> causing those log messages to be written to STDERR, and then jboss is 
> capturing the STDERR and assuming that if it went there it must be an 
> "ERROR" of some kind and logging it to the console (using its own log 
> format, hence the double timestamps per line message)
> 
> In short: jboss is doing this in response to normal logging from solr.  
> you should investigate your options for configuriring jboss and how it 
> deals with log messages from applications.
> 
> 
> : 11:41:19,030 INFO  [TomcatDeployer] deploy, ctxPath=/solr,
> : warUrl=.../tmp/deploy/tmp43266solr-exp.war/
> : 11:41:19,948 ERROR [STDERR] 8 Sep, 2009 11:41:19 AM
> : org.apache.solr.servlet.SolrDispatchFilter init
> : INFO: SolrDispatchFilter.init()
> : 11:41:19,975 ERROR [STDERR] 8 Sep, 2009 11:41:19 AM
> : org.apache.solr.core.SolrResourceLoader locateInstanceDir
> : INFO: No /solr/home in JNDI
> : 11:41:19,976 ERROR [STDERR] 8 Sep, 2009 11:41:19 AM
> : org.apache.solr.core.SolrResourceLoader locateInstanceDir
> : INFO: using system property solr.solr.home: C:\app\Search
> : 11:41:19,984 ERROR [STDERR] 8 Sep, 2009 11:41:19 AM
> : org.apache.solr.core.CoreContainer$Initializer initialize
> : INFO: looking for solr.xml: C:\app\Search\solr.xml
> : 11:41:20,084 ERROR [STDERR] 8 Sep, 2009 11:41:20 AM
> : org.apache.solr.core.SolrResourceLoader 
> : INFO: Solr home set to 'C:\app\Search' 
> : 11:41:20,142 ERROR [STDERR] 8 Sep, 2009 11:41:20 AM
> : org.apache.solr.core.SolrResourceLoader createClassLoader
> : INFO: Adding
> : 'file:/C:/app/Search/lib/apache-solr-dataimporthandler-1.3.0.jar' to
> Solr
> : classloader
> : 11:41:20,144 ERROR [STDERR] 8 Sep, 2009 11:41:20 AM
> : org.apache.solr.core.SolrResourceLoader createClassLoader
> : INFO: Adding 'file:/C:/app/Search/lib/jsp-2.1/' to Solr classloader
> : 
> : ...
> : INFO: Reusing parent classloader
> : 11:41:21,870 ERROR [STDERR] 8 Sep, 2009 11:41:21 AM
> : org.apache.solr.core.SolrConfig <init>
> : INFO: Loaded SolrConfig: solrconfig.xml
> : 11:41:21,909 ERROR [STDERR] 8 Sep, 2009 11:41:21 AM
> : org.apache.solr.schema.IndexSchema readSchema
> : INFO: Reading Solr Schema
> : 11:41:22,092 ERROR [STDERR] 8 Sep, 2009 11:41:22 AM
> : org.apache.solr.schema.IndexSchema readSchema
> : INFO: Schema name=contacts schema
> : 11:41:22,121 ERROR [STDERR] 8 Sep, 2009 11:41:22 AM
> : org.apache.solr.util.plugin.AbstractPluginLoader load
> : INFO: created string: org.apache.solr.schema.StrField
> : 
> : .
> : -- 
> : View this message in context:
> http://www.nabble.com/Misleading-log-messages-while-deploying-solr-tp25354654p25354654.html
> : Sent from the Solr - User mailing list archive at Nabble.com.
> : 
> 
> 
> 
> -Hoss
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Misleading-log-messages-while-deploying-solr-tp25354654p25379937.html
Sent from the Solr - User mailing list archive at Nabble.com.



Does MoreLikeThis support sharding?

2009-09-10 Thread jlist9
Hi,

I tried MoreLikeThis (StandardRequestHandler with mlt arguments)
with a single solr server and it works fine. However, when I tried
the same query with sharded servers, I don't get the moreLikeThis
key in the results.

So my question is: is MoreLikeThis with StandardRequestHandler
supported on shards? If not, is MoreLikeThisHandler supported?

Thanks,
Jack





Re: Solr fitting in travel site context?

2009-09-10 Thread Constantijn Visinescu
I'd look into faceting and run a test.

Create a schema, index the data, and then run a query for *:* faceted by
hotel to get a list of all the hotels you want, followed by a query that
returns all documents matching that hotel for your 2nd use case (see the
sketch below).
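
A sketch of the two queries (the hotel field name is illustrative and
assumed to be a non-tokenized string field):

  # step 1: one row per hotel, via facet counts
  http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=hotel

  # step 2: all packages for the hotel the user picked
  http://localhost:8983/solr/select?q=*:*&fq=hotel:"Hotel Foo"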

You're probably still going to want a SQL database to catch the reservations
made, though.

In my experience implementing Solr is more work than implementing a normal
SQL database, and losing the relational part of a relational database is
something you have to wrap your head around to see how it affects your
application.

That said, Solr on my 4-year-old single-core laptop outperforms our new dual
Xeon database server running IBM DB2 when it comes to running a query on a
10 million record dataset and returning the total number of documents that
match.

Once you get it up and running properly, and you need queries like
"give me the total number of documents that match these criteria, optionally
faceted by this and that", it's amazingly fast.

Note that this advantage only becomes apparent when dealing with large data
sets. Anything under a couple hundred thousand records (a guideline; it
depends heavily on the type of record) and a normal SQL server should also be
able to give you the results you need near instantly.

Hope this helps ;)


On Wed, Sep 9, 2009 at 5:33 PM, Carsten Kraus wrote:

> Hi all,
>
> I'm about to develop a travel website and am wondering if Solr might fit to
> be used as the search solution.
> Being quite the opposite of a db guru and new to Solr, it's hard for me to
> judge if for my use-case a relational db should be used in favor of Solr(or
> similar indexing server). Maybe some of you guys would share their opinion
> on this?
>
> The products being searched for would be travel packages. That is: hotel
> room + flight combined into one product.
> I receive the products via a csv file, where each line defines a travel
> package with concrete departure/return, accommodation and price data.
>
> For example one csv row might represent:
> Hotel Foo in Paris, flight departing 10/10/09 from London, ending 10/20/09,
> mealplan Bar, pricing $300
> ..while another one might look like:
> Hotel Foo in Paris, flight departing 10/10/09 from Amsterdam, ending
> 10/30/09, mealplan Eggs :), pricing $400
>
> Now searches should show results in 2 steps: first step showing results
> grouped by hotel(so no hotel appears twice) and second one all
> date-airport-mealplan combinations for the hotel selected by the user in
> step 1.
>
> From some first little tests, it seems to me as if I would at least need
> the collapse patch (SOLR-236) to be used in step 1 above?!
>
> What do you think? Does Solr fit into this scenario? Thoughts?
>
> Sorry for the lengthy post & thanks a lot for any pointer!
> Carsten
>