Funny, it works fine if I use VectorSource:

ovid <- Corpus(VectorSource(list.files(sourceDir)[1:1253]), readerControl = list(language = "lat"))

So I tried executing only

> DirSource(sourceDir)

and that fails with the error I mentioned earlier. So it's not a problem with Corpus(), as I initially thought.
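A note on the comparison above: VectorSource() treats each element of a character vector as one document, so VectorSource(list.files(sourceDir)) builds a corpus from the file names rather than from the file contents. That means it never reads the files DirSource() would read. One possible cause of the DirSource() failure (an assumption, not confirmed anywhere in this thread) is that file.info() returns NA for some entries in the directory, for example files whose names cannot be represented in the current locale. A minimal sketch for checking this with base R only; find_unreadable is a hypothetical helper name, not part of tm:

```r
# Return the paths under `dir` for which file.info() cannot report
# whether the entry is a directory (isdir is NA).
# `find_unreadable` is a hypothetical helper, not a tm function.
find_unreadable <- function(dir) {
  files <- list.files(dir, full.names = TRUE)
  info <- file.info(files)
  files[is.na(info$isdir)]
}

# Example call with the directory from the original post:
# find_unreadable("Z:\\projectk_viz\\docs_to_index")
```

If this returns any paths, moving or renaming those files and retrying DirSource(sourceDir) would show whether they are the ones triggering the error.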
Also, I noticed that VectorSource works much faster than DirSource. Is there any particular reason?

On Sat, Aug 17, 2013 at 11:16 AM, Ajinkya Kale <kaleajin...@gmail.com> wrote:

> It contains all text files which were converted from doc, docx, ppt etc.
> using libreoffice.
> Some of them are non-English text documents.
>
> Sorry, I cannot share the corpus, but if someone can shed light on what
> might cause this error then I can try to eliminate those documents if some
> specific docs are causing it.
>
> On Sat, Aug 17, 2013 at 9:55 AM, Milan Bouchet-Valat <nalimi...@club.fr> wrote:
>
>> On Friday, 16 August 2013 at 19:35 -0700, Ajinkya Kale wrote:
>> > I am trying to use the text mining package ... I keep getting this
>> > error:
>> >
>> > rm(list=ls())
>> > library(tm)
>> > sourceDir <- "Z:\\projectk_viz\\docs_to_index"
>> > ovid <- Corpus(DirSource(sourceDir), readerControl = list(language = "lat"))
>> >
>> > Error in if (vectorized && (length <= 0)) stop("vectorized sources must
>> > have positive length") : missing value where TRUE/FALSE needed
>> >
>> > I am not sure what it means.
>>
>> The posting guide asks for a reproducible example. If you cannot make
>> available to us the contents of sourceDir, at least you should tell us
>> what kind of files it contains. Have you tried with only some of the
>> files the directory contains?
>>
>> Regards
>>
>> > --ajinkya

--
Sincerely,
Ajinkya
http://ajinkya.info
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.