Funny, it works fine if I use VectorSource:

ovid <- Corpus(VectorSource(list.files(sourceDir)[1:1253]),
               readerControl = list(language = "lat"))

So I tried executing only DirSource(sourceDir), and that fails with the
error I mentioned earlier. So the problem is not in Corpus(), as I
initially thought, but in DirSource().
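
The error text says the `length <= 0` test evaluated to NA, which suggests the length computed for the source was itself NA. One possible cause, stated here as an assumption and not confirmed against the tm source, is that file.info() cannot read the metadata of some entries in the directory (for example, names containing characters the Windows locale cannot represent). A minimal base-R sketch to look for such entries follows; `unreadable_paths` is a hypothetical helper name, not a tm function.

```r
# Hypothetical helper (not part of tm): return the paths for which
# file.info() cannot report whether the entry is a directory (isdir is NA).
unreadable_paths <- function(paths) {
  info <- file.info(paths)
  paths[is.na(info$isdir)]
}

# Possible usage with the directory from this thread:
# bad <- unreadable_paths(list.files(sourceDir, full.names = TRUE))
# print(bad)
```

If this returns any paths, renaming or removing those files and retrying DirSource(sourceDir) would show whether they are the trigger.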

Also, I noticed that VectorSource runs much faster than DirSource.
Is there any particular reason?
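
A likely explanation for the speed difference: list.files() returns a character vector of file names, so the VectorSource call above builds a corpus whose "documents" are the names themselves, and no file is ever opened. To get a VectorSource corpus of the file contents, the text has to be read first. The sketch below does that in base R; `read_files_as_text` is a hypothetical helper name, and it assumes the files are plain text that readLines() can read.

```r
# Hypothetical helper (not part of tm): read every regular file in `dir`
# into a single string, returning a character vector named by file name.
read_files_as_text <- function(dir) {
  files <- list.files(dir, full.names = TRUE)
  # Keep only entries known to be regular files (drops directories and NA).
  files <- files[file.info(files)$isdir %in% FALSE]
  texts <- vapply(
    files,
    function(f) paste(readLines(f, warn = FALSE), collapse = "\n"),
    character(1)
  )
  setNames(texts, basename(files))
}

# Possible usage with tm, using the names from this thread:
# texts <- read_files_as_text(sourceDir)
# ovid  <- Corpus(VectorSource(texts), readerControl = list(language = "lat"))
```

Built this way, the corpus does the same file reading that DirSource would, so the timing comparison becomes meaningful.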


On Sat, Aug 17, 2013 at 11:16 AM, Ajinkya Kale <kaleajin...@gmail.com> wrote:

> It contains text files, all converted from doc, docx, ppt, etc.
> using LibreOffice.
> Some of them are non-English text documents.
>
> Sorry, I cannot share the corpus, but if someone can shed light on what
> might cause this error, I can try to eliminate the offending documents
> if specific ones are causing it.
>
>
> On Sat, Aug 17, 2013 at 9:55 AM, Milan Bouchet-Valat <nalimi...@club.fr> wrote:
>
>> On Friday, 16 August 2013 at 19:35 -0700, Ajinkya Kale wrote:
>> > I am trying to use the text mining package ... I keep getting this
>> > error:
>> >
>> > rm(list=ls())
>> > library(tm)
>> > sourceDir <- "Z:\\projectk_viz\\docs_to_index"
>> > ovid <- Corpus(DirSource(sourceDir), readerControl = list(language = "lat"))
>> >
>> > Error in if (vectorized && (length <= 0)) stop("vectorized sources must
>> > have positive length") : missing value where TRUE/FALSE needed
>> >
>> > I am not sure what it means.
>> The posting guide asks for a reproducible example. If you cannot make
>> available to us the contents of sourceDir, at least you should tell us
>> what kind of files it contains. Have you tried with only some of the
>> files in the directory?
>>
>>
>> Regards
>>
>> > --ajinkya
>> >
>> >       [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-help@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
>
> --
>
> Sincerely,
> Ajinkya
> http://ajinkya.info
>



-- 

Sincerely,
Ajinkya
http://ajinkya.info

