Not sure exactly what you mean, can you give an example?
Upayavira
On Sat, Jan 12, 2013, at 06:32 AM, J Mohamed Zahoor wrote:
> Cool… it worked… But the count of all the groups and the count inside
> stats component does not match…
> Is that a bug?
>
> ./zahoor
>
>
> On 11-Jan-2013, at 6:48 PM
Cool… it worked… But the count of all the groups and the count inside stats
component does not match…
Is that a bug?
./zahoor
On 11-Jan-2013, at 6:48 PM, Upayavira wrote:
> could you use field collapsing? Boost by date and only show one value
> per group, and you'll have the most recent document only.
Awesome!
This one line did the trick:
http://lucene.472066.n3.nabble.com/how-to-perform-a-delta-import-when-related-table-is-updated-tp4032587p4032671.html
Sent from the Solr - User mailing list archive at Nabble.com.
It's an ExtractingRequestHandler parameter (see the wiki). Not quite sure of the
Java incantation to set that, but it's definitely possible.
Erik
On Jan 11, 2013, at 17:14, uwe72 wrote:
> Erik, what do you mean by this parameter? I don't find it.
>
>
>
Erik, what do you mean by this parameter? I don't find it.
--
View this message in context:
http://lucene.472066.n3.nabble.com/SolrJ-ContentStreamUpdateRequest-Accessing-parsed-items-without-committing-to-solr-tp4032636p4032656.html
Look at the extractOnly parameter.
But doing the extraction in your client is the recommended way, to keep
Solr from getting beaten up too badly.
Erik
On Jan 11, 2013, at 15:55, uwe72 wrote:
> I have a somewhat strange use case.
>
> When I index a PDF to Solr I use ContentStreamUpdateReq
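A sketch of what the extractOnly approach suggested above might look like in SolrJ. This is an untested illustration, not code from the thread: the Solr URL, file name, and /update/extract handler path are assumptions based on a default Solr 4.x setup.

```java
import java.io.File;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;
import org.apache.solr.common.util.NamedList;

public class ExtractOnlyExample {
    public static void main(String[] args) throws Exception {
        SolrServer solr = new HttpSolrServer("http://localhost:8983/solr");

        // Send the PDF to the extracting handler, but ask Solr to only
        // run Tika extraction and return the result, not index anything.
        ContentStreamUpdateRequest req =
            new ContentStreamUpdateRequest("/update/extract");
        req.addFile(new File("document.pdf"), "application/pdf");
        req.setParam("extractOnly", "true");

        // The extracted text and metadata come back in the response;
        // nothing is committed to the index.
        NamedList<Object> response = solr.request(req);
        System.out.println(response);
    }
}
```

This needs a running Solr instance with the extraction contrib on the classpath, so treat it as a starting point rather than a drop-in.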
On 1/11/2013 1:33 PM, Achim Domma wrote:
"At the base, Solr indexes are Lucene indexes, so one can always
drop down to that level."
That's what I'm looking for. I understand that, in the end, there has to be an inverted index (or rather
multiple of them), holding all "words" which occur in m
ok, seems this works:
// org.apache.tika.Tika from tika-core; parseToString extracts plain text
Tika tika = new Tika();
String tokens = tika.parseToString(file);
--
View this message in context:
http://lucene.472066.n3.nabble.com/SolrJ-ContentStreamUpdateRequest-Accessing-parsed-items-without-committing-to-solr-tp4032636p4032649.html
Yes, I don't really want to index/store the PDF document in Lucene.
I just need the parsed tokens for other things.
So you mean I can use ExtractingRequestHandler.java to retrieve the items?
Does anybody have a piece of code doing that?
Actually I give the PDF as input and want the parsed items (the
If I understand it, you are sending the file to Solr, which then uses the Tika
library to do the preprocessing/extraction and stores the results in the
defined fields.
If you don't want Solr to do the storing and want to change the extracted
fields, just use the Tika library in your client and work with r
Moreover, I just checked: autoGeneratePhraseQueries="true" is set for both
3.4 and 4.0 in my schema.
Thanks
Varun
On Fri, Jan 11, 2013 at 1:04 PM, varun srivastava wrote:
> Hi Jack,
> Is this a new change in Solr 4.0? The autoGeneratePhraseQueries
> option seems to be present since Solr 3.1. Just
Hi Jack,
Is this a new change in Solr 4.0? The autoGeneratePhraseQueries
option seems to be present since Solr 3.1. I just wanted to confirm this is the
difference causing the change in behavior between 3.4 and 4.0.
Thanks
Varun
On Mon, Dec 24, 2012 at 3:00 PM, Jack Krupansky wrote:
> Thanks. Sloppy p
I have a somewhat strange use case.
When I index a PDF to Solr I use ContentStreamUpdateRequest.
The Lucene document then contains in the "text" field all contained items
(the parsed items of the physical PDF).
I also need to add these parsed items to another Lucene document.
Is there a way to recei
Try adding the "pk" attribute to the parent entity in any of these 4 ways:
From: [mailto:vettepa...@hotmail.com]
Sent: Friday, January 11, 2013 1:18 PM
To: solr-user@lucene.apache.org
Subject: RE: how to perform a delta-import when related table is updated
Hi James,
Ok, so I did this:
Have you looked at the Solr admin interface in detail? Specifically, the analysis
section under each core. It provides some of the statistics you seem to
want. And it gives you the source code to look at to understand how to create
your own version of that. Specifically, the "Luke" package is what you
might
On 12 January 2013 02:03, Achim Domma wrote:
> "At the base, Solr indexes are Lucene indexes, so one can always
> drop down to that level."
>
> That's what I'm looking for. I understand that, in the end, there has to be
> an inverted index (or rather multiple of them), holding all "words" which
"At the base, Solr indexes are Lucene indexes, so one can always
drop down to that level."
That's what I'm looking for. I understand that, in the end, there has to be an
inverted index (or rather multiple of them), holding all "words" which occur
in my documents, each "word" having a list of d
On 12 January 2013 01:06, Achim Domma wrote:
>
> Hi,
>
> I have just set up my first Solr 4.0 instance and have added about one
> million documents. I would like to access the raw data stored in the index.
> Can somebody give me a starting point on how to do that?
>
> As a first step, a simple dump wo
Hi,
I have just set up my first Solr 4.0 instance and have added about one million
documents. I would like to access the raw data stored in the index. Can
somebody give me a starting point on how to do that?
As a first step, a simple dump would be absolutely ok. I just want to play
around and do s
Hi James,
Ok, so I did this:
I now get this error in the logfile:
SEVERE: Delta Import Failed
java.lang.IllegalArgumentException: deltaQuery has no column to resolve to
declared primary key pk='ID'
Peter,
See http://wiki.apache.org/solr/DataImportHandler#Using_delta-import_command ,
then scroll down to where it says "The deltaQuery in the above example only
detects changes in item but not in other tables..." It shows you two ways to
do it.
Option 1: add a reference to the last_modified
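The wiki's pattern boils down to a data-config sketch roughly like the following. This is an illustration only: the table and column names (item, details, last_modified) are assumptions, not taken from the original post.

```xml
<!-- Sketch of DIH delta-import across two tables; names are illustrative -->
<entity name="item" pk="ID"
        query="SELECT * FROM item"
        deltaQuery="SELECT ID FROM item
                    WHERE last_modified &gt; '${dataimporter.last_index_time}'">
  <entity name="details"
          query="SELECT * FROM details WHERE item_id = '${item.ID}'"
          deltaQuery="SELECT item_id FROM details
                      WHERE last_modified &gt; '${dataimporter.last_index_time}'"
          parentDeltaQuery="SELECT ID FROM item
                            WHERE ID = '${details.item_id}'"/>
</entity>
```

The parentDeltaQuery is the key piece: it maps a changed child row back to the parent primary key so the whole parent document gets re-indexed.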
On 1/11/2013 9:15 AM, Markus Jelsma wrote:
FYI: XInclude works fine. We have all request handlers in solrconfig in
separate files and include them via XInclude on a running SolrCloud cluster.
Good to know. I'm still deciding whether I want to recombine or
continue to use xinclude. Is the xi
My delta-import
(http://localhost:8983/solr/freemedia/dataimport?command=delta-import) does
not correctly update my solr fields.
Please see my data-config here:
Now when a new item is inserted into [freemedialikes]
an
Hi Marcel,
Are you committing data with hard commits or soft commits? I've seen systems
where we've inadvertently only used soft commits, which means that the entire
transaction log has to be re-read on startup, which can take a long time. Hard
commits flush indexed data to disk, and make it
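For reference, a hard-commit policy can be configured in solrconfig.xml alongside soft commits. A minimal sketch, with illustrative interval values:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flush to disk at most every 60s, without opening a new searcher -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit: make documents visible to searchers every 5s -->
  <autoSoftCommit>
    <maxTime>5000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

With regular hard commits in place, the transaction log stays short and startup replay is correspondingly fast.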
On 11 January 2013 22:30, Jens Grivolla wrote:
[...]
> Actually, that is what you would get when doing a join in an RDBMS, the
> cross-product of your tables. This is NOT AT ALL what you typically do in
> Solr.
>
> Best start the other way around, think of Solr as a retrieval system, not a
> st
On 01/11/2013 05:23 PM, Gora Mohanty wrote:
You are still thinking of Solr as an RDBMS, which you should not
be. In your case, it is easiest to flatten out the data. This increases
the size of the index, but that should not really be of concern. As
your courses and languages tables are connected o
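Flattening as suggested means one Solr document per user/course/language combination rather than separate joined tables. A sketch of such a document in Solr's XML update format, with purely illustrative field names and values:

```xml
<add>
  <doc>
    <!-- One document per combination; id must be unique per document -->
    <field name="id">user42-course7-en</field>
    <field name="userid">42</field>
    <field name="coursename">Solr Basics</field>
    <field name="language">English</field>
  </doc>
</add>
```

Searching then becomes a single-index query with filters on userid, coursename, or language, with no join needed.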
They point to the admin UI - or should - that seems right?
- Mark
On Jan 11, 2013, at 10:57 AM, Christopher Gross wrote:
> I've managed to get my SolrCloud set up to have 2 different indexes up and
> running. However, my URLs aren't right. They just point to
> http://server:port/solr, not htt
Hmm, you need to set up the HttpClient in HttpShardHandlerFactory but you
cannot access the HttpServletRequest from there; it is only available in
SolrDispatchFilter AFAIK. And then, the HttpServletRequest can only return the
remote user name, not the password he, she or it provided. I don't kno
Hello, and thank you in advance for your help!
*Context:*
I have implemented a custom search component that receives 3 parameters:
field, termValue and payloadX.
The component should search for a termValue in the requested Lucene
field and for each *termValue* to check *payloadX* in its associated
On 11 January 2013 21:13, Niklas Langvig wrote:
> It sounds good not to use more than one core; I certainly do not want to
> overcomplicate this.
[...]
Yes, not only are multiple cores unnecessarily complicated here,
your searches will also be less complex, and faster.
> Both table courses a
I changed it to only go to one Zookeeper (localhost:2181) and it still gave
me the same stack trace error.
I was eventually able to get around this -- I just used the "bootstrap"
arguments when starting up my Tomcat instances to push the configs over --
though I'd rather just do it externally from
FYI: XInclude works fine. We have all request handlers in solrconfig in
separate files and include them via XInclude on a running SolrCloud cluster.
-Original message-
> From:Mark Miller
> Sent: Fri 11-Jan-2013 17:13
> To: solr-user@lucene.apache.org
> Subject: Re: Setting up new SolrC
On Jan 10, 2013, at 12:06 PM, Shawn Heisey wrote:
> On 1/9/2013 8:54 PM, Mark Miller wrote:
>> I'd put everything into one. You can upload different named sets of config
>> files and point collections either to the same sets or different sets.
>>
>> You can really think about it the same way y
It's a bug that you only see RuntimeException - in 4.1 you will get the real
problem - which is likely around connecting to zookeeper. You might try with a
single zk host in the zk host string initially. That might make it easier to
track down why it won't connect. It's tough to diagnose because
It sounds good not to use more than one core; I certainly do not want to
overcomplicate this.
Yes, I meant tables.
It's pretty simple.
Both tables, courses and languages, have their own primary keys, courseseqno and
languagesseqno.
Both also have a foreign key "userid" that references the users table wi
On 11 January 2013 19:57, Niklas Langvig wrote:
> Ahh sorry,
> Now I understand,
> OK, seems like a good solution, I just need to understand how to query
> multiple cores now :)
There is no need to use multiple cores in your setup. Going
back to your original problem statement, it can easily
I don't know how to query multiple cores, or whether it's possible at once, but
otherwise I would create a JOIN SQL script if you need values from multiple
tables.
D.
On Fri, Jan 11, 2013 at 3:27 PM, Niklas Langvig <
niklas.lang...@globesoft.com> wrote:
> Ahh sorry,
> Now I understand,
> Ok seems l
Ahh sorry,
Now I understand.
OK, seems like a good solution, I just need to understand how to query
multiple cores now :)
-Original message-
From: Dariusz Borowski [mailto:darius...@gmail.com]
Sent: 11 January 2013 15:15
To: solr-user@lucene.apache.org
Subject: Re: confi
Thanks Alex!
This brought me to the solution I wanted to achieve. :)
D.
On Thu, Jan 10, 2013 at 3:21 PM, Alexandre Rafalovitch
wrote:
> dataimport.properties is for DIH to store its own properties for delta
> processing and things. Try solrcore.properties instead, as per recent
> discussion:
Hmmm, it will not work for me. I want the "original" credentials
forwarded in the sub-requests. The credentials are mapped to permissions
(authorization), and basically I don't want a user to be able to have
something done in the (automatically performed by the contacted
solr-node) sub-requests that
Hi,
No, it actually has two tables, User and Item. The example shown on the
blog is for one table, because you repeat the same thing for the other
table. Only your data-import.xml file changes. For the rest, just copy and
paste it into the conf directory. If you are running your Solr on Linux, then
Hi Dariusz,
To me this example has one table, "user", while I have many tables that connect
to one user, and that is what I'm unsure how to do.
/Niklas
-Original message-
From: Dariusz Borowski [mailto:darius...@gmail.com]
Sent: 11 January 2013 14:56
To: solr-user@luce
Hmm, I noticed I wrote that I have 3 columns: users, courses and languages.
I of course meant I have 3 tables: users, courses and languages.
/Niklas
-Original message-
From: Niklas Langvig [mailto:niklas.lang...@globesoft.com]
Sent: 11 January 2013 14:19
To: solr-user@lucene.apache.o
Thinking about it some more:
Perhaps I could have coursename and such as multivalued fields?
Or should I have separate indices for users, courses and languages?
I get the feeling both would work, but I'm not sure which way is best.
When a user is updating/removing/adding a course it would be nice to
Hi Niklas,
Maybe this link helps:
http://www.coderthing.com/solr-with-multicore-and-database-hook-part-1/
D.
On Fri, Jan 11, 2013 at 2:19 PM, Niklas Langvig <
niklas.lang...@globesoft.com> wrote:
> Hi!
> I'm quite new to Solr and trying to understand how to create a schema from
> our pos
Hi!
I know the pain! ;)
That's why I wrote a bit on a blog, so I could remember it in the future. Here
is the link, in case you would like to read a tutorial on how to set up Solr
with multicore and hook it up to the database:
http://www.coderthing.com/solr-with-multicore-and-database-hook-part-1/
I hope
Hi,
We're experiencing slow startup times of searchers in Solr when it contains a
large number of documents.
We use Solr v4.0 with Jetty and currently have 267,657,634 documents stored,
spread across 9 cores. These documents contain keywords, with additional
statistics, which we are using for s
Seems I'm too lazy.
I found this http://wiki.apache.org/solr/MergingSolrIndexes, and it really
works.
--
View this message in context:
http://lucene.472066.n3.nabble.com/SolrCloud-removing-shard-how-to-not-loose-data-tp4032138p4032508.html
Hi!
I'm quite new to Solr and trying to understand how to create a schema from
our Postgres database and then search for the content in Solr instead of
querying the DB.
My question should be really easy; it has most likely been asked many times, but
still I'm not able to google any answer to
could you use field collapsing? Boost by date and only show one value
per group, and you'll have the most recent document only.
Upayavira
On Fri, Jan 11, 2013, at 01:10 PM, jmozah wrote:
> One crude way is to first query and pick the latest date from the result,
> then issue a query with q=timestamp:[
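The field-collapsing suggestion above could look roughly like the following request parameters. This is a sketch: the group field name and the timestamp field are illustrative, not from the original posts.

```
q=*:*&group=true&group.field=yourGroupField&group.limit=1&group.sort=timestamp desc
```

With group.limit=1 and group.sort on the date field, each group returns only its most recent document, so a single query replaces the two-query approach.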
One crude way is to first query and pick the latest date from the result,
then issue a query with q=timestamp:[latestDate TO latestDate].
But I don't want to execute two queries...
./zahoor
On 11-Jan-2013, at 6:37 PM, jmozah wrote:
>
>
>
>> What do you want?
>> 'the most recent ones' or '**only**
> What do you want?
> 'the most recent ones' or '**only** the latest' ?
>
> Perhaps a range query "q=timestamp:[refdate TO NOW]" will match your needs.
>
> Uwe
>
I need **only** the latest documents...
In the above query, "refdate" can vary based on the query.
./zahoor
Hi,
If your credentials are fixed, I would configure username:password in your
request handler's shardHandlerFactory configuration section and then modify
HttpShardHandlerFactory.init() to create an HttpClient with an AuthScope
configured with those settings.
I don't think you can obtain the ori
Hello.
Which is the best/fastest way to get the values of many fields from the index?
My problem is that I need to calculate a sum of amounts. This amount is in
my index (stored="true"). My PHP script gets all values with paging. But if a
request takes too long, Jetty kills this process of "expor
In solrconfig.xml:
<str name="defType">edismax</str>
<str name="qf">text^0.5 last_name^1.0 first_name^1.2 course_name^7.0 id^10.0
  branch_name^1.1 hq_passout_year^1.4
  course_type^10.0 institute_name^5.0 qualification_type^5.0
  mail^2.0 state_name^1.0</str>
<str name="df">text</str>
<str name="mm">100%</str>
<str name="q.alt">*:*</str>
<int name="rows">10</int>
Hi
I read http://wiki.apache.org/solr/SolrSecurity and know a lot about
webcontainer authentication and authorization. I'm sure I will be able to
set it up so that each solr-node will require HTTP authentication for
(selected) incoming requests.
But solr-nodes also make requests among each
On 10.01.2013 11:54, jmozah wrote:
I need a query that matches only the most recent ones...
Because my stats depend on it...
But I have a requirement to show **only** the latest documents and the
"stats" along with it...
What do you want?
'the most recent ones' or '**only** the latest' ?
Perh
Mark, I know I still have access to the data and I can wake the shard up again.
What I want to do is:
I have 3 shards on 3 nodes, one on each. Now I discover that I don't need 3
nodes and I want only 2.
So I want to remove a shard and put the data from it onto those that are left.
Is there a way to index that data wit