Re: Lucene index corruption and recovery
Another sanity check. With deletion, the only option would be to reindex those documents. Could someone please let me know if I am missing anything or if I am on track here?

Thanks.
Re: Lucene index
Hi Dinesh,

There are two ways in which you can import data from databases:

1. Use your custom code with the SolrJ client library to upload documents to Solr -- http://wiki.apache.org/solr/Solrj
2. Use DataImportHandler and write data-config.xml and custom Transformers -- http://wiki.apache.org/solr/DataImportHandler

Take a look at both and use the one which suits you best.

On Wed, Sep 24, 2008 at 6:37 PM, Dinesh Gupta <[EMAIL PROTECTED]> wrote:
>
> Hi Shalin Shekhar,
>
> First of all thanks to you for quick replying.
>
> I have done the things that you have explained here
>
> Since I am creating indexes in multi threads and it takes 6-10 hours to
> creating for approx. 3 lac products
>
> I am using hibernate to access DB & applying custom logic to prepare data
> and putting in a map
> and finally writing to index.
>
> Now can I achieve this.
>
> I am able to search by using solr web admin
> but not able to add.
> Please tell me how can I attach my file to you.
>
> Thanks
>
> Regards,
> Dinesh Gupta
RE: Lucene index
Hi Shalin Shekhar,

First of all, thank you for the quick reply.

I have done the things that you explained here.

I create the indexes in multiple threads, and it takes 6-10 hours to build them for approx. 3 lakh (300,000) products.

I use Hibernate to access the DB, apply custom logic to prepare the data, put it in a map, and finally write it to the index.

How can I achieve this?

I am able to search using the Solr web admin, but I am not able to add documents.
Please tell me how I can attach my file for you.

Thanks

Regards,
Dinesh Gupta

> Date: Tue, 23 Sep 2008 19:36:22 +0530
> From: [EMAIL PROTECTED]
> To: solr-user@lucene.apache.org
> Subject: Re: Lucene index
>
> Hi Dinesh,
>
> This seems straightforward for Solr. You can use the embedded jetty server
> for a start. Look at the tutorial on how to get started.
>
> You'll need to modify the schema.xml to define all the fields that you want
> to index. The wiki page at http://wiki.apache.org/solr/SchemaXml is a good
> start on how to do that. Each field in your code will have a counterpart in
> the schema.xml with appropriate flags (indexed/stored/tokenized etc.)
>
> Once that is complete, try to modify the DataImportHandler's hsqldb example
> for your mysql database.
>
> On Tue, Sep 23, 2008 at 7:01 PM, Dinesh Gupta <[EMAIL PROTECTED]> wrote:
> >
> > Hi Shalin Shekhar,
> >
> > Let me explain my issue.
> >
> > I have some tables in my database like
> >
> > Product
> > Category
> > Catalogue
> > Keywords
> > Seller
> > Brand
> > Country_city_group
> > etc.
> >
> > I have a class that represent product document as
> >
> >     Document doc = new Document();
> >     // Keywords which can be used directly for search
> >     doc.add(new Field("id", (String) data.get("PRN"), Field.Store.YES, Field.Index.UN_TOKENIZED));
> >
> >     // Sorting fields
> >     String priceString = (String) data.get("Price");
> >     if (priceString == null)
> >         priceString = "0";
> >     long price = 0;
> >     try {
> >         price = (long) Double.parseDouble(priceString);
> >     } catch (Exception e) {
> >     }
> >
> >     doc.add(new Field("prc", NumberUtils.pad(price), Field.Store.YES, Field.Index.UN_TOKENIZED));
> >     Date createDate = (Date) data.get("CreateDate");
> >     if (createDate == null) createDate = new Date();
> >
> >     doc.add(new Field("cdt", String.valueOf(createDate.getTime()), Field.Store.NO, Field.Index.UN_TOKENIZED));
> >
> >     Date modiDate = (Date) data.get("ModiDate");
> >     if (modiDate == null) modiDate = new Date();
> >
> >     doc.add(new Field("mdt", String.valueOf(modiDate.getTime()), Field.Store.NO, Field.Index.UN_TOKENIZED));
> >     //doc.add(Field.UnStored("cdt", String.valueOf(createDate.getTime())));
> >
> >     // Additional fields for search
> >     doc.add(new Field("bnm", (String) data.get("Brand"), Field.Store.YES, Field.Index.TOKENIZED));
> >     doc.add(new Field("bnm1", (String) data.get("Brand1"), Field.Store.NO, Field.Index.UN_TOKENIZED));
> >     //doc.add(Field.Text("bnm", (String) data.get("Brand"))); //Tokenized and Unstored
> >     doc.add(new Field("bid", (String) data.get("BrandId"), Field.Store.YES, Field.Index.UN_TOKENIZED));
> >     //doc.add(Field.Keyword("bid", (String) data.get("BrandId"))); // untokenized &
> >     doc.add(new Field("grp", (String) data.get("Group"), Field.Store.NO, Field.Index.TOKENIZED));
> >     //doc.add(Field.Text("grp", (String) data.get("Group")));
> >     doc.add(new Field("gid", (String) data.get("GroupId"), Field.Store.YES, Field.Index.UN_TOKENIZED));
> >     //doc.add(Field.Keyword("gid", (String) data.get("GroupId"))); //New
> >     doc.add(new Field("snm", (String) data.get("Seller"), Field.Store.YES, Field.Index.UN_TOKENIZED));
> >     //doc.add(Field.Text("snm", (String) data.get("Seller")));
> >     doc.add(new Field("sid", (String) data.get("SellerId"), Field.Store.YES, Field.Index.UN_TOKENIZED));
> >     //doc.add(Field.Keyword("sid", (String) data.get("SellerId"))); // New
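The quoted code stores the price through NumberUtils.pad in an untokenized field, presumably so that plain string comparison orders the values numerically. Here is a minimal sketch of that idea using only the JDK. The class SortablePrice and its methods parsePrice and pad are hypothetical names chosen for illustration; they are not the NumberUtils class from the post, and the width of 10 digits is an assumption.

```java
/** Hypothetical helper; a stand-in for the idea behind NumberUtils.pad, not that class itself. */
final class SortablePrice {

    /** Parses a price string, falling back to 0 for null or malformed input. */
    static long parsePrice(String priceString) {
        if (priceString == null) {
            return 0L;
        }
        try {
            // Truncates the fractional part, as the cast in the quoted code does.
            return (long) Double.parseDouble(priceString.trim());
        } catch (NumberFormatException e) {
            return 0L; // same fallback value the quoted code ends up with
        }
    }

    /** Left-pads a non-negative value with zeros to a fixed width. */
    static String pad(long value, int width) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values are not supported: " + value);
        }
        String digits = Long.toString(value);
        if (digits.length() > width) {
            throw new IllegalArgumentException("value does not fit in " + width + " digits");
        }
        StringBuilder sb = new StringBuilder(width);
        for (int i = digits.length(); i < width; i++) {
            sb.append('0');
        }
        return sb.append(digits).toString();
    }

    public static void main(String[] args) {
        String cheap = pad(parsePrice("99.50"), 10);
        String pricey = pad(parsePrice("1200"), 10);
        System.out.println(cheap);  // 0000000099
        System.out.println(pricey); // 0000001200
        // Unpadded, "99" sorts after "1200" as a string; padded, the order is numeric.
        System.out.println(cheap.compareTo(pricey) < 0); // true
    }
}
```

Without padding, a string sort would place "99" after "1200"; with a fixed width, the lexicographic order and the numeric order agree for non-negative values.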
Re: Lucene index
> ion"), Field.Store.NO, Field.Index.TOKENIZED));
>     //doc.add(Field.UnStored("sdc", (String) data.get("SpecialDescription"), true));
>     doc.add(new Field("kdc", (String) data.get("KeywordDescription"), Field.Store.NO, Field.Index.TOKENIZED));
>     //doc.add(Field.UnStored("kdc", (String) data.get("KeywordDescription"), true));
>
>     // ColumnB - Product Category and parent categories
>     doc.add(new Field("cts", (String) data.get("Categories"), Field.Store.YES, Field.Index.TOKENIZED));
>     //doc.add(Field.Text("cts", (String) data.get("Categories")));
>
>     // ColumnB - Product Category and parent categories //Raman
>     doc.add(new Field("dct", (String) data.get("DirectCategories"), Field.Store.YES, Field.Index.TOKENIZED));
>     //doc.add(Field.Text("dct", (String) data.get("DirectCategories")));
>
>     // ColumnC - Product Catalogues
>     doc.add(new Field("clg", (String) data.get("Catalogues"), Field.Store.YES, Field.Index.TOKENIZED));
>     //doc.add(Field.Text("clg", (String) data.get("Catalogues")));
>
>     // Product Delivery Cities
>     doc.add(new Field("dcty", (String) data.get("DelCities"), Field.Store.YES, Field.Index.TOKENIZED));
>     // Additional Information
>     // Top Selling Count
>     String sellerCount = ((Long) data.get("SellCount")).toString();
>     doc.add(new Field("bsc", sellerCount, Field.Store.YES, Field.Index.TOKENIZED));
>
> I am preparing data from querying database.
> Please tell me how can I migrate my logic to Solr.
> I have spend more than a week.
> But have got nothing.
> Please help me.
>
> Can I attach my files here?
>
> Thanks in Advance
>
> Regards
> Dinesh Gupta

--
Regards,
Shalin Shekhar Mangar.
RE: Lucene index
atalogues
    doc.add(new Field("clg", (String) data.get("Catalogues"), Field.Store.YES, Field.Index.TOKENIZED));
    //doc.add(Field.Text("clg", (String) data.get("Catalogues")));

    // Product Delivery Cities
    doc.add(new Field("dcty", (String) data.get("DelCities"), Field.Store.YES, Field.Index.TOKENIZED));

    // Additional Information
    // Top Selling Count
    String sellerCount = ((Long) data.get("SellCount")).toString();
    doc.add(new Field("bsc", sellerCount, Field.Store.YES, Field.Index.TOKENIZED));

I am preparing the data by querying the database.
Please tell me how I can migrate my logic to Solr.
I have spent more than a week on this but have got nothing.
Please help me.

Can I attach my files here?

Thanks in advance

Regards
Dinesh Gupta

> Date: Tue, 23 Sep 2008 18:53:07 +0530
> From: [EMAIL PROTECTED]
> To: solr-user@lucene.apache.org
> Subject: Re: Lucene index
>
> On Tue, Sep 23, 2008 at 5:33 PM, Dinesh Gupta <[EMAIL PROTECTED]> wrote:
> >
> > Hi,
> > Current we are using Lucene api to create index.
> >
> > It creates index in a directory with 3 files like
> >
> > xxx.cfs , deletable & segments.
> >
> > If I am creating Lucene indexes from Solr, these file will be created or
> > not?
>
> The lucene index will be created in the solr_home inside the data/index
> directory.
>
> > Please give me example on MySQL data base instead of hsqldb
>
> If you are talking about DataImportHandler then there is no difference in
> the configuration except for using the MySql driver instead of hsqldb.
>
> --
> Regards,
> Shalin Shekhar Mangar.
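One way to keep the existing data preparation and still hand the documents to Solr is to turn the prepared field map into an XML add message for Solr's update handler instead of building a Lucene Document. The following is only a sketch under stated assumptions: it assumes every map value has already been converted to a String, that the field names (id, bnm and so on) are declared in schema.xml, and the class SolrAddXml with its methods escape and toAddXml is a hypothetical helper, not part of Solr or SolrJ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that renders a field map as a Solr XML add message. */
final class SolrAddXml {

    /** Escapes the characters that are special in XML text and attribute values. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds <add><doc>...</doc></add> with one <field> element per non-null entry. */
    static String toAddXml(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (e.getValue() == null) {
                continue; // skip missing values rather than sending empty fields
            }
            sb.append("<field name=\"").append(escape(e.getKey())).append("\">")
              .append(escape(e.getValue()))
              .append("</field>");
        }
        return sb.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        Map<String, String> product = new LinkedHashMap<>();
        product.put("id", "PRN-1");
        product.put("bnm", "Acme & Co");
        System.out.println(toAddXml(product));
    }
}
```

The resulting string would be sent in an HTTP POST with an XML content type to the update handler (for the example setup this is usually http://localhost:8983/solr/update, which is an assumption about your installation), followed by a `<commit/>` message so that the documents become searchable. SolrJ, mentioned elsewhere in this thread, wraps the same steps behind a Java API.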
Re: Lucene index
On Tue, Sep 23, 2008 at 5:33 PM, Dinesh Gupta <[EMAIL PROTECTED]> wrote:
>
> Hi,
> Current we are using Lucene api to create index.
>
> It creates index in a directory with 3 files like
>
> xxx.cfs , deletable & segments.
>
> If I am creating Lucene indexes from Solr, these file will be created or
> not?

The Lucene index will be created in the solr_home, inside the data/index directory.

> Please give me example on MySQL data base instead of hsqldb

If you are talking about DataImportHandler, then there is no difference in the configuration except for using the MySQL driver instead of the hsqldb one.

--
Regards,
Shalin Shekhar Mangar.
RE: Lucene index verifier
Given the size of our index, using file checksums is more feasible.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Yonik Seeley
Sent: Friday, February 08, 2008 5:10 AM
To: solr-user@lucene.apache.org
Subject: Re: Lucene index verifier

If someone wanted those additional checks, it seems like the right place to hook it in would be the snapshooter or snapinstaller.

-Yonik

On Feb 8, 2008 8:04 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> I think Mike M. put up a tool called CheckIndex that is a simple
> driver program that checks for corruption. However, my understanding
> is that he isn't sure it is complete just yet, but it is a start.
> Have a look in the latest release.
>
> Maybe it would be useful to have it run either on startup or
> periodically in Solr (if configured to do so). I haven't tried it, so
> I don't know what effect it has on performance/search/indexing.
>
> -Grant
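The checksum approach mentioned above could look roughly like the sketch below: compute a digest for each file in the master index directory and in the copied snapshot, then compare the two sets. This is an illustration under assumptions, not what the poster actually runs: the class SnapshotChecksums and its methods are hypothetical names, SHA-256 is an arbitrary choice of digest, and the index directory is assumed to be flat. Note that matching checksums only show that the copy is byte-identical to its source; they say nothing about whether the source index itself is internally consistent, which is the kind of check CheckIndex aims at.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper that compares two index directories by per-file checksum. */
final class SnapshotChecksums {

    /** Returns file name -> hex SHA-256 digest for each regular file directly inside dir. */
    static Map<String, String> checksums(Path dir) throws IOException, NoSuchAlgorithmException {
        Map<String, String> result = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    result.put(file.getFileName().toString(), sha256(file));
                }
            }
        }
        return result;
    }

    /** Streams the file through the digest so large index files are never held in memory. */
    static String sha256(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** True when both directories hold the same file names with identical digests. */
    static boolean sameContents(Path master, Path copy) throws IOException, NoSuchAlgorithmException {
        return checksums(master).equals(checksums(copy));
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: SnapshotChecksums <master-dir> <snapshot-dir>");
            return;
        }
        boolean same = sameContents(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(same ? "snapshot matches master" : "snapshot differs from master");
    }
}
```

A check like this could be run after the snapshot has been pulled and before it is installed, which is the hook point suggested in the quoted message.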
Re: Lucene index verifier
If someone wanted those additional checks, it seems like the right place to hook it in would be the snapshooter or snapinstaller.

-Yonik

On Feb 8, 2008 8:04 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> I think Mike M. put up a tool called CheckIndex that is a simple
> driver program that checks for corruption. However, my understanding
> is that he isn't sure it is complete just yet, but it is a start.
> Have a look in the latest release.
>
> Maybe it would be useful to have it run either on startup or
> periodically in Solr (if configured to do so). I haven't tried it, so
> I don't know what effect it has on performance/search/indexing.
>
> -Grant
Re: Lucene index verifier
I think Mike M. put up a tool called CheckIndex that is a simple driver program that checks for corruption. However, my understanding is that he isn't sure it is complete just yet, but it is a start. Have a look in the latest release.

Maybe it would be useful to have it run either on startup or periodically in Solr (if configured to do so). I haven't tried it, so I don't know what effect it has on performance/search/indexing.

-Grant

On Feb 7, 2008, at 11:15 PM, Lance Norskog wrote:

> (Sorry, my Lucene java-user access is wonky.)
>
> I would like to verify that my snapshots are not corrupt before I enable them.
>
> What is the simplest program to verify that a Lucene index is not corrupt?
>
> Or, what is a Solr query that will verify that there is no corruption? With the minimum amount of time?
>
> Thanks,
>
> Lance Norskog