Tri Dang: Please follow the instructions here:
https://lucene.apache.org/solr/discussion.html

Best,
Erick

On Sun, Mar 16, 2014 at 6:15 PM, Tri Dang <tritd...@yahoo.com> wrote:
> Please unsubscribe me.
>
> On Sunday, March 16, 2014 9:10 PM, Alexandre Rafalovitch <arafa...@gmail.com>
> wrote:
>
> Which version of Solr are you on? Because for Solr 4, the endpoint
> should be /update and the Content-Type should be correct. See:
> http://wiki.apache.org/solr/UpdateCSV
>
> I would expect the problem NOT to be around Japanese, but around other
> things. You could, for example, try to index Japanese into the example
> collection that comes with Solr. That way you get all the other variables
> correct. Then add another field+fieldType and see if it still
> works.
>
> Regards,
> Alex.
> Personal website: http://www.outerthoughts.com/
> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> - Time is the quality of nature that keeps events from happening all
> at once. Lately, it doesn't seem to be working. (Anonymous - via GTD
> book)
>
> On Sat, Mar 15, 2014 at 11:50 PM, Bala Iyer <grb...@yahoo.com> wrote:
>> Hi,
>>
>> I am new to using Solr with Japanese.
>> I added support for Japanese in schema.xml.
>> How can I insert Japanese text into that field, either with a Solr client
>> (Java / PHP / Ruby) or with curl?
>>
>> schema.xml
>> ====================================
>> <field name="username" type="string" indexed="true" stored="true"
>>        multiValued="true" omitNorms="true" termVectors="true" />
>> <field name="timestamp" type="date" indexed="true" stored="true"
>>        multiValued="true" omitNorms="true" termVectors="true" />
>> <field name="jtxt" type="text_ja" indexed="true" stored="true"
>>        multiValued="true" omitNorms="true" termVectors="true" />
>>
>> <fieldType name="text_ja" class="solr.TextField"
>>            positionIncrementGap="100" autoGeneratePhraseQueries="false">
>>   <analyzer>
>>     <tokenizer class="solr.JapaneseTokenizerFactory" mode="search"/>
>>
>>     <!--<tokenizer class="solr.JapaneseTokenizerFactory" mode="search"
>>         userDictionary="lang/userdict_ja.txt"/>-->
>>     <!-- Reduces inflected verbs and adjectives to their base/dictionary
>>          forms (辞書形) -->
>>     <filter class="solr.JapaneseBaseFormFilterFactory"/>
>>     <!-- Removes tokens with certain part-of-speech tags -->
>>     <filter class="solr.JapanesePartOfSpeechStopFilterFactory"
>>             tags="lang/stoptags_ja.txt" />
>>     <!-- Normalizes full-width romaji to half-width and half-width kana
>>          to full-width (Unicode NFKC subset) -->
>>     <filter class="solr.CJKWidthFilterFactory"/>
>>     <!-- Removes common tokens typically not useful for search, but have
>>          a negative effect on ranking -->
>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>             words="lang/stopwords_ja.txt" />
>>     <!-- Normalizes common katakana spelling variations by removing any
>>          last long sound character (U+30FC) -->
>>     <filter class="solr.JapaneseKatakanaStemFilterFactory"
>>             minimumLength="4"/>
>>     <!-- Lower-cases romaji characters -->
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>   </analyzer>
>> </fieldType>
>> ====================================
>>
>> my insert.csv file
>>
"id","username","timestamp","content","jtxt" >> "999999999","xxxxx","2013-12-26T10:14:26Z","Hello ","マイ ドキュメント" >> ========================= >> I am trying to insert through curl it gives me error >> curl >> "http://localhost:8983/solr/collection1/update/csv?separator=,&commit=true" >> -H "Content-Type: text/plain; charset=utf-8" --data-binary @insert.csv >> >> >> ERROR >> ---------------------------- >> <?xml version="1.0" encoding="UTF-8"?> >> <response> >> <lst name="responseHeader"><int name="status">400</int><int >> name="QTime">23</int >>></lst><lst name="error"><str name="msg">Document is missing mandatory >>>uniqueKey >> field: id</str><int name="code">400</int></lst> >> </response> >> >> I know i should not use "Content-Type as text/plain" >> ========================= >> >> >> Thanks