Re: [R] Problem with Snowball RWeka
On Thursday 12 January 2012 at 15:18 -0800, plecto wrote:
> Thanks! I read your short-term solution and, thanks to it, was able to get stemming working in R on Mac OS X. I actually used Sys.setenv(NOAWT=TRUE) instead of Sys.setenv(NOAWT, true), as the latter produces the following error message:
> Error in Sys.setenv(NOAWT, true) : all arguments must be named

For people who might come across this thread later while searching for help: if this solution did not work for you, make sure you run Sys.setenv(NOAWT=TRUE) *before* loading Snowball/RWeka/rJava via library(). Otherwise it won't have any effect.

Hope this will save you some time ;-)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
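For later readers, the ordering requirement described in this message can be sketched in a few lines of base R. This is only an illustration under assumptions: the variable name NOAWT comes from this thread, but the helper name set_noawt and the check against loadedNamespaces() are hypothetical and not part of Snowball, RWeka or rJava.

```r
# Hypothetical helper (not part of Snowball, RWeka or rJava):
# set NOAWT and report whether it was set before rJava was loaded.
set_noawt <- function() {
  # If rJava is already loaded, the JVM may already be running,
  # so setting the variable now is probably too late for this session.
  too_late <- "rJava" %in% loadedNamespaces()
  if (too_late) {
    warning("rJava is already loaded; restart R and set NOAWT first")
  }
  # Environment values are strings, so TRUE is stored as "TRUE".
  Sys.setenv(NOAWT = TRUE)
  invisible(!too_late)
}

set_noawt()
# Only after this point load the Java-backed packages,
# e.g. library(Snowball) or library(tm).
```

Calling this at the very top of a script, before any library() call, is the safest placement; the warning is just a hint that a fresh R session is probably needed.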
Re: [R] Problem with Snowball RWeka
Thanks! I read your short-term solution and, thanks to it, was able to get stemming working in R on Mac OS X. I actually used Sys.setenv(NOAWT=TRUE) instead of Sys.setenv(NOAWT, true), as the latter produces the following error message:

    Error in Sys.setenv(NOAWT, true) : all arguments must be named

--
View this message in context: http://r.789695.n4.nabble.com/Problem-with-Snowball-RWeka-tp3402126p4290844.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Problem with Snowball RWeka
The Java error when attempting to use the stemmers in the Snowball or tm packages on Windows machines is caused by Quicktime. See prior posts in this thread. The workaround is to uninstall Quicktime.

After much trial and error on machines spanning WinXP/2k/Vista/7, I finally verified this as follows:

1) Fresh installation of Windows/Java/R. Snowball package works perfectly.
2) Install Quicktime. Java errors produced when attempting to use Snowball package.
3) Uninstall Quicktime. Snowball package works perfectly again.

Many thanks to Profs. Ligges, Hornik, and Feinerer for their kind help in diagnosing this.
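Before uninstalling anything, it may be worth checking whether Quicktime has left a trace in the Java-related environment variables that R sees. The sketch below rests on an assumption that is not confirmed anywhere in this thread: that Quicktime for Windows registers a QTJava.zip archive through the CLASSPATH and QTJAVA environment variables. The helper name mentions_qtjava is hypothetical.

```r
# Hypothetical diagnostic, assuming Quicktime registers "QTJava.zip"
# in CLASSPATH and/or QTJAVA (an assumption, not confirmed in this thread).
mentions_qtjava <- function(vars = c("CLASSPATH", "QTJAVA")) {
  # Unset variables are returned as empty strings.
  values <- Sys.getenv(vars, unset = "")
  any(grepl("QTJava", values, ignore.case = TRUE))
}

# TRUE suggests a Quicktime entry is visible to this R session.
mentions_qtjava()
```

A FALSE result does not rule out Quicktime as the cause; it only means these two variables do not mention it.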
Re: [R] Problem with Snowball RWeka
I too have this problem. Everything worked fine last year, but after updating R and packages I can no longer do word stemming. Unfortunately, I didn't save the old binaries, otherwise I would just revert. Hoping someone finds a solution for R on Windows. Thanks!

There is a potential solution for R on Mac OS from Kurt Hornik copied below, but I cannot get this to work on Windows.

Here's the code I'm running:

    # 1) Using package Snowball
    library(Snowball)
    source <- readLines(system.file("words", "porter", "voc.txt", package = "Snowball"))
    result <- SnowballStemmer(source)

    # 2) Using package tm
    library(tm)
    data(crude)
    stemDocument(crude[[1]])

In both instances I got a Java error:

    Could not initialize the GenericPropertiesCreator. This exception was produced: java.lang.NullPointerException

After receiving this error once in the session, no further error messages are generated. However, SnowballStemmer() and stemDocument() return the original unstemmed text.

Possible solution: for those on Mac OS, Kurt Hornik wrote...

> These issues seem to be specific to Mac OS X. Recent versions of Weka have added a package management system not unlike R's, to the effect that now when external packages (or the Snowball jar) are loaded their KnowledgeFlow GUI is started, which in turn requires AWT---and from what I understand, this does not work on Mac OS X.
>
> Short term, you should be able to Sys.setenv(NOAWT, true). More long term, the Weka maintainers have patched their upstream code so that it is possible to turn off the dynamic class discovery altogether, but I have not found the time to test this ...

I realize this solution was for Mac OS, but not knowing anything about rJava I tried it on Windows anyway, resulting in:

    Error in Sys.setenv(NOAWT, true) : all arguments must be named

Here's my session info.
    R version 2.13.0 Patched (2011-04-21 r55576)
    Platform: i386-pc-mingw32/i386 (32-bit) (Windows Vista)

    locale:
    [1] LC_COLLATE=English_United States.1252
    [2] LC_CTYPE=English_United States.1252
    [3] LC_MONETARY=English_United States.1252
    [4] LC_NUMERIC=C
    [5] LC_TIME=English_United States.1252

    attached base packages:
    [1] stats graphics grDevices datasets utils methods base

    other attached packages:
    [1] Snowball_0.0-7 tm_0.5-6 rcom_2.2-3.1 rscproxy_1.3-1

    loaded via a namespace (and not attached):
    [1] grid_2.13.0 rJava_0.9-0 (same error with multiple older versions) RWeka_0.4-7 RWekajars_3.7.3-1
    [5] slam_0.1-22 tools_2.13.0
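Because the Java error is reported only once per session and the stemmers afterwards return their input unchanged, a silent failure is easy to miss. A minimal base-R sketch of a sanity check follows; the helper name stemming_had_effect and the sample words are hypothetical, and the "stemmed" vectors are hand-written stand-ins rather than real stemmer output.

```r
# Hypothetical helper: TRUE if at least one word was changed by the stemmer.
stemming_had_effect <- function(original, stemmed) {
  stopifnot(length(original) == length(stemmed))
  any(original != stemmed)
}

words <- c("running", "connections", "cat")

# Stand-in for a stemmer that failed silently and returned its input:
stemming_had_effect(words, words)                       # FALSE

# Stand-in for plausible stemmed output:
stemming_had_effect(words, c("run", "connect", "cat"))  # TRUE
```

With Snowball loaded, the second argument would presumably be SnowballStemmer(words); a FALSE result on a vocabulary that clearly contains inflected forms suggests the stemmer did not initialize.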
Re: [R] Problem with Snowball RWeka
Greetings to all,

I have a similar issue with Snowball. I am running R version 2.12.1 (2010-12-16) on Windows 7.

Here is my script:

    library(tm)
    custom.xml <- system.file("texts", "custom.xml", package = "tm")
    print(readLines(custom.xml), quote = FALSE)

    # Map XML nodes to document metadata and content
    myXMLReader <- readXML(
        spec = list(
            Language = list("node", "/document/language"),
            DateTimeStamp = list("node", "/document/date"),
            Origin = list("node", "/document/source"),
            Description = list("node", "/document/subject"),
            Type = list("node", "/document/country"),
            Heading = list("node", "/document/title"),
            Content = list("node", "/document/contenu"),
            Author = list("node", "/document/author")),
        doc = PlainTextDocument())

    mySource <- function(x, encoding = "UTF-8")
        XMLSource(x, function(tree) XML::xmlRoot(tree)$children, myXMLReader, encoding)

    corpusmf <- Corpus(mySource(custom.xml))
    meta(corpusmf[[1]])
    meta(corpusmf[[2]])

    corpusmf <- tm_map(corpusmf, stripWhitespace)
    corpusmf <- tm_map(corpusmf, removeNumbers)
    corpusmf <- tm_map(corpusmf, removePunctuation)
    corpusmf <- tm_map(corpusmf, stemDocument)

    matrix <- TermDocumentMatrix(corpusmf, control = list(weighting = weightBin))
    print(matrix)

stemDocument returns an error message:

    Stemmer 'porter' unknown!
    Stemmer 'english' unknown!
    Stemmer 'porter' unknown!
    Stemmer 'english' unknown!

I tried calling library(Snowball) beforehand, but the result is the same.

I found a clue on the Weka website, http://weka.wikispaces.com/The+snowball+stemmers+don%27t+work,+what+am+I+doing+wrong%3F, but I don't understand what I should do with these archives.

I would be grateful if someone could help with this.

Kind regards,

--
View this message in context: http://r.789695.n4.nabble.com/Problem-with-Snowball-RWeka-tp3402126p3569089.html
Re: [R] Problem with Snowball RWeka
Dear Forum,

I also have problems with Snowball (Mac OS, fresh install of rJava, RWeka, RWekajars, Snowball from http://cran.ch.r-project.org/bin/macosx/leopard/ ):

    library(Snowball)
    example(SnowballStemmer)

    SnwblS> ## Test the supplied vocabulary for the default stemmer ('porter'):
    SnwblS> source <- readLines(system.file("words", "porter", "voc.txt",
    SnwblS+                                 package = "Snowball"))
    SnwblS> result <- SnowballStemmer(source)
    Error in .jnew(name) :
      java.lang.InternalError: Can't start the AWT because Java was started on the first thread. Make sure StartOnFirstThread is not specified in your application's Info.plist or on the command line
    Trying to add database driver (JDBC): RmiJdbc.RJDriver - Warning, not in CLASSPATH?
    Trying to add database driver (JDBC): jdbc.idbDriver - Warning, not in CLASSPATH?
    Trying to add database driver (JDBC): org.gjt.mm.mysql.Driver - Warning, not in CLASSPATH?
    Trying to add database driver (JDBC): com.mckoi.JDBCDriver - Warning, not in CLASSPATH?
    Trying to add database driver (JDBC): org.hsqldb.jdbcDriver - Warning, not in CLASSPATH?

    example(SnowballStemmer)

    SnwblS> ## Test the supplied vocabulary for the default stemmer ('porter'):
    SnwblS> source <- readLines(system.file("words", "porter", "voc.txt",
    SnwblS+                                 package = "Snowball"))
    SnwblS> result <- SnowballStemmer(source)
    SnwblS> target <- readLines(system.file("words", "porter", "output.txt",
    SnwblS+                                 package = "Snowball"))
    SnwblS> ## Any differences?
    SnwblS> any(result != target)
    [1] TRUE
    Stemmer 'porter' unknown!

    sessionInfo()
    R version 2.12.1 (2010-12-16)
    Platform: i386-apple-darwin9.8.0/i386 (32-bit)

    locale:
    [1] fr_FR/fr_FR/fr_FR/C/fr_FR/fr_FR

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] Snowball_0.0-7

    loaded via a namespace (and not attached):
    [1] grid_2.12.1 rJava_0.8-8 RWeka_0.4-3 RWekajars_3.7.3-1 tools_2.12.1

Any hints? Thanks!
On 24 March 2011 at 12:31, Mike Marchywka wrote:

> > Date: Thu, 24 Mar 2011 03:35:31 -0700
> > From: kont...@alexanderbachmann.de
> > To: r-help@r-project.org
> > Subject: [R] Problem with Snowball RWeka
> >
> > Dear Forum,
> >
> > when I try to use SnowballStemmer() I get the following error message:
> >
> > Could not initialize the GenericPropertiesCreator. This exception was produced: java.lang.NullPointerException
> >
> > It seems to have something to do with either Snowball or RWeka; however, I can't figure out what to do myself. If you could spend 5 minutes of your valuable time to help me or give me a hint about where to look, it would be very much appreciated. Thank you very much.
>
> If you only want answers from people who have encountered this exact problem before, then that's great, but you are more likely to get a useful response if you include reproducible code and some data to produce the error you have seen. Sometimes I investigate these things because they involve a package or objective I wanted to look at anyway. It could be that the only problem is that the OP missed something in the documentation or had a typo, etc.
>
> In this case, to pursue it from the perspective of debugging the code, you probably want to find some way to get a stack trace, then find out which Java variable was null and relate it back to how you invoked it. This likely points to a missing object in your call, or maybe the installation lacked a dependency, as this occurred during init, but it is hard to speculate with what you have provided. You could try reinstalling and checking for errors.

--
Jean-Pierre Müller
SSP / Anthropole 4123 / UNIL / CH - 1015 Lausanne
Voice:+41 21 692 3116 / Fax:+41 21 692 3115
Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html (in French: http://www.gnu.org/philosophy/no-word-attachments.fr.html)
[R] Problem with Snowball RWeka
Dear Forum,

when I try to use SnowballStemmer() I get the following error message:

    Could not initialize the GenericPropertiesCreator. This exception was produced: java.lang.NullPointerException

It seems to have something to do with either Snowball or RWeka; however, I can't figure out what to do myself. If you could spend 5 minutes of your valuable time to help me or give me a hint about where to look, it would be very much appreciated.

Thank you very much.

--
View this message in context: http://r.789695.n4.nabble.com/Problem-with-Snowball-RWeka-tp3402126p3402126.html
Re: [R] Problem with Snowball RWeka
> Date: Thu, 24 Mar 2011 03:35:31 -0700
> From: kont...@alexanderbachmann.de
> To: r-help@r-project.org
> Subject: [R] Problem with Snowball RWeka
>
> Dear Forum,
>
> when I try to use SnowballStemmer() I get the following error message:
>
> Could not initialize the GenericPropertiesCreator. This exception was produced: java.lang.NullPointerException
>
> It seems to have something to do with either Snowball or RWeka; however, I can't figure out what to do myself. If you could spend 5 minutes of your valuable time to help me or give me a hint about where to look, it would be very much appreciated. Thank you very much.

If you only want answers from people who have encountered this exact problem before, then that's great, but you are more likely to get a useful response if you include reproducible code and some data to produce the error you have seen. Sometimes I investigate these things because they involve a package or objective I wanted to look at anyway. It could be that the only problem is that the OP missed something in the documentation or had a typo, etc.

In this case, to pursue it from the perspective of debugging the code, you probably want to find some way to get a stack trace, then find out which Java variable was null and relate it back to how you invoked it. This likely points to a missing object in your call, or maybe the installation lacked a dependency, as this occurred during init, but it is hard to speculate with what you have provided. You could try reinstalling and checking for errors.
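As a first step towards the "missing dependency" check suggested above, one could list which of the packages involved are actually installed, and in which version, before reinstalling. This is only a sketch: the helper name installed_versions is hypothetical, and the package list is taken from the session info posted elsewhere in this thread.

```r
# Hypothetical helper: installed version of each package, NA if it is missing.
installed_versions <- function(pkgs) {
  vapply(pkgs, function(p) {
    # find.package() returns character(0) when the package is not installed.
    path <- find.package(p, quiet = TRUE)
    if (length(path) == 0L) NA_character_ else as.character(packageVersion(p))
  }, character(1))
}

# Packages named in this thread; an NA marks one that needs installing.
installed_versions(c("rJava", "RWekajars", "RWeka", "Snowball"))
```

The lookup does not load any of the packages, so it should be safe to run even in a session where loading rJava itself triggers the error.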