From my tests, this only affects Windows XP and earlier. *nix and OS X always use the full charsets.jar. Windows Vista and Windows 7 "support" all languages by default and report this through http://msdn.microsoft.com/en-us/library/dd317827(v=vs.85).aspx, so the "testing code" in the installer gets back true for all language groups and is forced to install the full charsets.jar. This is described in the Sun issue, and I verified it on at least Vista and 7 Ultimate: the installer seems to install full language support even on German Windows, in contrast to XP, which installs no charsets.jar (in the jre/lib folder).
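
For anyone who wants to check their own installation, here is a minimal sketch (not from the Sun issue; the class name CharsetCheck and the helper isAvailable are made up for illustration). It only asks the running JVM whether it can resolve a charset name via java.nio.charset.Charset.isSupported, which should return false for GB18030 on a JRE that was installed without the full charsets.jar:

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper for illustration; not part of Solr or the JRE.
public class CharsetCheck {

    // Returns true if the running JVM can resolve the given charset name.
    // Syntactically illegal names are reported as "not available".
    public static boolean isAvailable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the charsets discussed in this thread if none are given.
        String[] names = args.length > 0 ? args : new String[] {"UTF-8", "GB18030"};
        for (String name : names) {
            System.out.println(name + ": "
                    + (isAvailable(name) ? "supported" : "NOT supported"));
        }
    }
}
```

Compile and run it with the same java binary that starts Solr, e.g. "javac CharsetCheck.java" followed by "java CharsetCheck GB18030". UTF-8 is required on every Java platform, so it is only there as a control.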
-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]

> -----Original Message-----
> From: Jan Høydahl [mailto:[email protected]]
> Sent: Monday, April 04, 2011 6:25 PM
> To: [email protected]
> Subject: Re: Unsupported encoding GB18030
>
> Makes sense.
>
> Question is, do we want to require full JDK to index exampledocs? Most
> developers will have a JDK, but the occasional semi-tech manager just
> wanting to test out Solr may get burnt and think "Open Source sucks, just as I
> thought" :)
>
> I added a note to http://wiki.apache.org/solr/SolrInstall about the need for
> JDK for international charsets..
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> On 4. apr. 2011, at 17.06, Uwe Schindler wrote:
>
> > To come back to the original issue:
> > If you are using a pure JRE installed in your operating system using
> > the standard mechanism "browser automatically installs Java Plugin
> > methods" or similar, the following applies:
> > http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6329080
> >
> > To reduce size of downloads, the JRE-only installation does not
> > contain the full charsets.jar, so the problem is expected. In fact,
> > those JRE's only contain the basic charsets as Robert told and the
> > ones needed for your area (it analyzes your environment in the
> > installer and chooses between western, eastern and possibly others to
> > download only the corresponding charsets.jar).
> >
> > We should maybe add a note to Solr, that you should in all cases use a
> > full locale JRE installation or better a JDK, else the full
> > international functionality of Solr cannot be used.
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: [email protected]
> >
> >
> >> -----Original Message-----
> >> From: Jan Høydahl [mailto:[email protected]]
> >> Sent: Monday, April 04, 2011 1:37 PM
> >> To: [email protected]
> >> Subject: Re: Unsupported encoding GB18030
> >>
> >>>>> : I don't see the reason why "exampledocs" should contain docs
> >>>>> with narrow charsets not guaranteed to be supported.
> >>>> personally i would like to see us add a lot more exampledocs in a
> >>>> lot more esoteric encodings, precisely to help end users sanity
> >>>> test this sort of thing. we frequently get questions from people about
> >>>> character encoding wonkiness, and things like test_utf8.sh,
> >>>> utf8-example.xml, and now gb18030-example.xml can help us narrow
> >>>> down the problem:
> >>>> their client code, their servlet container, or solr?
> >>>
> >>> Same here. In my opinion, an example set of files should also
> >>> contain "more complicated" ones to show what Solr can do. If some of
> >>> them don't work, it's not really a problem. Maybe we should simply
> >>> add a "tag" to the filename to mark them as not working in every
> >>> configuration.
> >>
> >> Positive to more example docs!
> >>
> >> My concern was that since indexing exampledocs/*.xml is perhaps THE
> >> most common action any new Solr user will do, it should just work,
> >> and it's a benefit if the results revolve around the same theme, a
> >> set of products with category and prices. We definitely want to show
> >> off more advanced features, and we should add more example documents
> >> for that. Plain test docs could be placed in a subfolder
> >> "exampledocs/extras" or something.
> >>
> >> Regarding the Windows XP VMware image I was using, it had a Sun JRE
> >> (not JDK) which was auto-updated from 1.5 to 1.6.
> >> After completely uninstalling Java and re-installing
> >> jdk-6u24-windows-i586.exe the GB18030 encoding is supported.
> >>
> >> --
> >> Jan Høydahl, search solution architect
> >> Cominvent AS - www.cominvent.com
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: [email protected]
> >> For additional commands, e-mail: [email protected]
