No, it is fully distributed testing. It is ok — StatET handles log4j logging for me, so i see the logs. I was wondering whether any end-to-end diagnostics are already embedded in Crunch, but reporting backend errors to the front end is notoriously hard (and sometimes impossible) with hadoop, so I assume it doesn't make sense to report client-only problems through an exception while the other failures still require checking isSucceeded().
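Part of why run() never throws here is that job submission and monitoring happen on background threads (note the `[Thread-8]` in the log above): an exception raised on a worker thread is invisible to the calling thread unless it is explicitly retrieved. A minimal, Crunch-free JVM illustration of that effect (class name and failure message are made up for the sketch):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AsyncFailureDemo {

    /** Submits a failing "job" on another thread; returns whether it succeeded. */
    static boolean runJob() throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // The failure happens on the worker thread. Nothing is thrown
            // at the caller unless the Future is explicitly interrogated.
            Future<Void> job = pool.submit((Callable<Void>) () -> {
                throw new IllegalStateException("Input path does not exist");
            });
            try {
                job.get(); // only here does the worker's exception surface
                return true;
            } catch (ExecutionException e) {
                System.out.println("cause: " + e.getCause().getMessage());
                return false;
            }
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("succeeded: " + runJob()); // prints "succeeded: false"
    }
}
```

A framework that never calls the equivalent of `get()` on the caller's behalf can only log the failure and flip a success flag — which matches the observed isSucceeded() behavior.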
On Fri, Nov 16, 2012 at 11:07 AM, Josh Wills <[email protected]> wrote:
> Are you running this using LocalJobRunner? Does calling Pipeline.enableDebug() before run() help? If it doesn't, it'll help settle a debate I'm having w/Matthias. ;-)
>
> On Fri, Nov 16, 2012 at 10:22 AM, Dmitriy Lyubimov <[email protected]> wrote:
>> I see the error in the logs but Pipeline.run() has never thrown anything. isSucceeded() subsequently returns false. Is there any way to extract the client-side problem, rather than just being able to state that the job failed? Or is that ok and the only diagnostic by design?
>>
>> ============
>> 68124 [Thread-8] INFO org.apache.crunch.impl.mr.exec.CrunchJob -
>> org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://localhost:11010/crunchr-example/input
>>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:231)
>>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:248)
>>     at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:944)
>>     at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:961)
>>     at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
>>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880)
>>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:396)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
>>     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
>>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:476)
>>     at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob.submit(CrunchControlledJob.java:331)
>>     at org.apache.crunch.impl.mr.exec.CrunchJob.submit(CrunchJob.java:135)
>>     at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.startReadyJobs(CrunchJobControl.java:251)
>>     at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.run(CrunchJobControl.java:279)
>>     at java.lang.Thread.run(Thread.java:662)
>>
>> On Mon, Nov 12, 2012 at 5:41 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>> for hadoop nodes i guess yet another option is to soft-link the .so into hadoop's native lib folder
>>>
>>> On Mon, Nov 12, 2012 at 5:37 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>>> I actually want to defer this to hadoop admins; we just need to create a procedure for setting up nodes, ideally as simple as possible. Something like:
>>>>
>>>> 1) set up R
>>>> 2) install.packages(c("rJava", "RProtoBuf", "crunchR"))
>>>> 3) R CMD javareconf
>>>> 4) add the result of R --vanilla <<< 'system.file("jri", package="rJava")' to either the mapred command lines or LD_LIBRARY_PATH...
>>>>
>>>> but it will depend on their versions of hadoop, jre etc. I hoped crunch might have something to hide a lot of that complexity (since it is about hiding complexities, for the most part :) ). Besides, hadoop has a way to ship .so's to the backend, so if crunch had an api to do something similar, it is conceivable that the driver might yank and ship it too, hiding that complexity as well. But then there's a host of issues around how to handle potentially different rJava versions installed on different nodes... So it increasingly looks like something we might want to defer to sysops, with an approximate set of requirements.
>>>>
>>>> On Mon, Nov 12, 2012 at 5:29 PM, Josh Wills <[email protected]> wrote:
>>>>> On Mon, Nov 12, 2012 at 5:17 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>>>>> so java tasks need to be able to load libjri.so from whatever system.file("jri", package="rJava") says.
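The loadLibrary-vs-library-path issue above can be sketched as a small fallback loader: try `System.loadLibrary()` (which only consults `-Djava.library.path` / `LD_LIBRARY_PATH` as configured at JVM start), and fall back to `System.load()` with an absolute path, e.g. whatever `system.file("jri", package="rJava")` reported on that node. The fallback directory below is a hypothetical example, not a path from the thread:

```java
import java.io.File;

public class NativeLoader {
    /**
     * Try to load a native library by name via java.library.path first,
     * then fall back to loading it from an absolute path. Returns true
     * if either attempt succeeded.
     */
    static boolean loadNative(String name, String fallbackDir) {
        try {
            System.loadLibrary(name);             // consults -Djava.library.path
            return true;
        } catch (UnsatisfiedLinkError e) {
            // mapLibraryName("jri") -> "libjri.so" on Linux
            File f = new File(fallbackDir, System.mapLibraryName(name));
            if (f.exists()) {
                System.load(f.getAbsolutePath()); // absolute path, no search
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        // "jri" will normally not be installed on an arbitrary machine; the
        // fallback dir is a stand-in for the rJava package's jri directory.
        System.out.println("loaded: "
                + loadNative("jri", "/usr/lib/R/site-library/rJava/jri"));
    }
}
```

Note this only sidesteps the search-path problem when the task can learn the absolute path at runtime; it does not help if the JVM command line is locked down and the library truly must come from `java.library.path`.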
>>>>>> Traditionally, these issues were handled with -Djava.library.path. Apparently there's nothing a java task can do to enable the loadLibrary() call to see the library once the jvm has started. But -Djava.library.path requires the nodes to configure and lock the jvm command line against modification by the client, which is fine.
>>>>>>
>>>>>> I also discovered that LD_LIBRARY_PATH actually works with jre 1.6 (again).
>>>>>>
>>>>>> but... any other suggestions about best practice for configuring crunch to run user .so's?
>>>>>
>>>>> Not off the top of my head. I suspect that whatever you come up with will become the "best practice." :)
>>>>>
>>>>>> thanks.
>>>>>>
>>>>>> On Sun, Nov 11, 2012 at 1:41 PM, Josh Wills <[email protected]> wrote:
>>>>>>> I believe that is a safe assumption, at least right now.
>>>>>>>
>>>>>>> On Sun, Nov 11, 2012 at 1:38 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>>>>>>> Question.
>>>>>>>>
>>>>>>>> So in the Crunch api, initialize() doesn't get an emitter, and process() gets an emitter every time.
>>>>>>>>
>>>>>>>> However, my guess is that any single reincarnation of a DoFn object in the backend will always get the same emitter through its lifecycle. Is that an admissible assumption, or is there currently a counterexample to it?
>>>>>>>>
>>>>>>>> The problem is that as i implement the two-way pipeline of input and emitter data between R and Java, I am bulking these calls together for performance reasons. Each individual datum in these chunks of data will not have emitter function information attached to it in any way. (well, it could, but that would be a performance killer and i bet the emitter never changes).
>>>>>>>>
>>>>>>>> So, thoughts? can i assume the emitter never changes between the first and last call to a DoFn instance?
>>>>>>>>
>>>>>>>> thanks.
>>>>>>>>
>>>>>>>> On Mon, Oct 29, 2012 at 6:32 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>>>>>>>> yes...
>>>>>>>>>
>>>>>>>>> i think it worked for me before, although just adding all jars from the R package distribution would be a little more appropriate an approach -- but it creates a problem with jars in dependent R packages. I think it would be much easier to just compile a hadoop-job file and stick it in, rather than cherry-picking individual jars from who knows how many locations.
>>>>>>>>>
>>>>>>>>> i think i used the hadoop job format with the distributed cache before and it worked... at least with Pig's "register jar" functionality.
>>>>>>>>>
>>>>>>>>> ok i guess i will just try it and see if it works.
>>>>>>>>>
>>>>>>>>> On Mon, Oct 29, 2012 at 6:24 PM, Josh Wills <[email protected]> wrote:
>>>>>>>>>> On Mon, Oct 29, 2012 at 5:46 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>>>>>>>>>> Great! so it is in Crunch.
>>>>>>>>>>>
>>>>>>>>>>> does it support the hadoop-job jar format or only pure java jars?
>>>>>>>>>>
>>>>>>>>>> I think just pure jars -- you're referring to the hadoop-job format as having all the dependencies in a lib/ directory within the jar?
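The emitter-bulking idea discussed above — cache the emitter on the first process() call, buffer inputs, and emit in batches — can be sketched with small stand-ins for Crunch's DoFn and Emitter types. These mock types are not the real Crunch API (and they bake in the assumption the question asks about: that the emitter instance is stable for the DoFn's lifetime):

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal stand-in for Crunch's Emitter, just to show the pattern. */
interface Emitter<T> { void emit(T t); }

/** Mock of a batching DoFn: buffers inputs and emits results in bulk. */
abstract class BatchingFn<S, T> {
    private final List<S> buffer = new ArrayList<>();
    private final int batchSize;
    private Emitter<T> cachedEmitter;   // assumed stable across process() calls

    BatchingFn(int batchSize) { this.batchSize = batchSize; }

    /** Called once per input; batches work instead of crossing the R/Java boundary per datum. */
    public void process(S input, Emitter<T> emitter) {
        cachedEmitter = emitter;        // the assumption: same instance every time
        buffer.add(input);
        if (buffer.size() >= batchSize) flush();
    }

    /** In real Crunch this would be driven from the DoFn's cleanup hook. */
    public void cleanup() { flush(); }

    private void flush() {
        for (T t : processBatch(buffer)) cachedEmitter.emit(t);
        buffer.clear();
    }

    /** Subclasses process a whole chunk at once (e.g. one round-trip into R). */
    protected abstract List<T> processBatch(List<S> batch);
}

public class BatchingDemo {
    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        BatchingFn<String, String> fn = new BatchingFn<String, String>(2) {
            @Override protected List<String> processBatch(List<String> batch) {
                List<String> r = new ArrayList<>();
                for (String s : batch) r.add(s.toUpperCase());
                return r;
            }
        };
        Emitter<String> em = out::add;
        fn.process("a", em);
        fn.process("b", em);   // triggers a flush of ["a", "b"]
        fn.process("c", em);
        fn.cleanup();          // flushes the remainder
        System.out.println(out); // prints [A, B, C]
    }
}
```

If the stable-emitter assumption ever failed, the cached reference would silently route buffered output to the wrong emitter, which is why the question matters.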
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Oct 29, 2012 at 5:10 PM, Josh Wills <[email protected]> wrote:
>>>>>>>>>>>> On Mon, Oct 29, 2012 at 5:04 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>>>>>>>>>>>> I think i need functionality to add more jars (or an external hadoop-jar) to drive that from an R package. Just setting the job jar by class is not enough. I can push the overall job-jar as an additional jar into the R package; however, i cannot really run the hadoop command line on it — i need to set up the classpath through RJava.
>>>>>>>>>>>>>
>>>>>>>>>>>>> A traditional single hadoop job jar will unlikely work here, since we cannot hardcode pipelines in java code but rather have to construct them on the fly. (well, we could serialize pipeline definitions from R and then replay them in a driver -- but that's too cumbersome and more work than it has to be.) There's no reason why i shouldn't be able to do a pig-like "register jar" or a mahout-like "setJobJar" when kicking off a pipeline.
>>>>>>>>>>>>
>>>>>>>>>>>> o.a.c.util.DistCache.addJarToDistributedCache?
>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Oct 29, 2012 at 10:17 AM, Dmitriy Lyubimov <[email protected]> wrote:
>>>>>>>>>>>>>> Ok, sounds very promising...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> i'll try to start digging on the driver part this week then (Pipeline wrapper in R5).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Oct 28, 2012 at 11:56 AM, Josh Wills <[email protected]> wrote:
>>>>>>>>>>>>>>> On Fri, Oct 26, 2012 at 2:40 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>>>>>>>>>>>>>>> Ok, cool.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So what state is Crunch in? I take it it is in a fairly advanced state. So every api mentioned in the FlumeJava paper is working, right? Or is there something that is specifically not working?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think the only thing in the paper that we don't have in a working state is MSCR fusion. It's mostly just a question of prioritizing it and getting the work done.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Oct 26, 2012 at 2:31 PM, Josh Wills <[email protected]> wrote:
>>>>>>>>>>>>>>>>> Hey Dmitriy,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Got a fork going and looking forward to playing with crunchR this weekend -- thanks!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> J
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Oct 24, 2012 at 1:28 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>>>>>>>>>>>>>>>>> Project template: https://github.com/dlyubimov/crunchR
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The default profile does not compile the R artifact; the R profile does. For convenience, it is enabled by supplying -DR to the mvn command line, e.g.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>     mvn install -DR
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> there's also a helper that installs the snapshot version of the package in the crunchR module.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> There are RJava and JRI java dependencies which i did not find anywhere in the public maven repos, so they are installed into my github maven repo so far. Should compile for 3rd parties.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -DR compilation requires R, rJava and, optionally, RProtoBuf. R doc compilation requires roxygen2 (i think).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For some reason RProtoBuf fails to import into another package — i got a weird exception when i put @import RProtoBuf into crunchR, so RProtoBuf is now in the "Suggests" category. Down the road that may be a problem though...
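A property-activated Maven profile of the kind described (toggled by `mvn install -DR`) might look like the fragment below. This is a hedged sketch, not the actual crunchR pom: the profile id matches the `-DR` convention from the thread, but the exec-maven-plugin wiring and the `R/crunchR` module path are hypothetical.

```xml
<!-- pom.xml fragment: profile enabled by `mvn install -DR` -->
<profiles>
  <profile>
    <id>R</id>
    <activation>
      <!-- activates whenever the R property is present on the command line -->
      <property>
        <name>R</name>
      </property>
    </activation>
    <build>
      <plugins>
        <!-- hypothetical: drive `R CMD build` via exec-maven-plugin -->
        <plugin>
          <groupId>org.codehaus.mojo</groupId>
          <artifactId>exec-maven-plugin</artifactId>
          <executions>
            <execution>
              <id>r-package-build</id>
              <phase>package</phase>
              <goals><goal>exec</goal></goals>
              <configuration>
                <executable>R</executable>
                <arguments>
                  <argument>CMD</argument>
                  <argument>build</argument>
                  <argument>${project.basedir}/R/crunchR</argument>
                </arguments>
              </configuration>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>
  </profile>
</profiles>
```

The default build skips this profile entirely, so developers without an R toolchain can still compile the java side.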
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> other than the template, not much else has been done so far... finding the hadoop libraries and adding them to the package path on initialization via "hadoop classpath"... adding the Crunch jars and their non-"provided" transitives to crunchR's java part...
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> No legal stuff...
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> No readmes... complete stealth at this point.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Oct 18, 2012 at 12:35 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>> Ok, cool. I will try to roll a project template by some time next week. We can start by prototyping and benchmarking something really simple, such as parallelDo().
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> My interim goal is to perhaps take some more or less simple algorithm from Mahout and demonstrate that it can be solved with Rcrunch (or whatever name it has to be) in comparable time (performance) but with much fewer lines of code. (say, one of the factorization or clustering things)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Oct 17, 2012 at 10:24 PM, Rahul <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>> I am not much of an R user but I am interested to see how well we can integrate the two. I would be happy to help.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> regards,
>>>>>>>>>>>>>>>>>>>> Rahul
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 18-10-2012 04:04, Josh Wills wrote:
>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 17, 2012 at 3:07 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>> Yep, ok.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I imagine it has to be an R module, so I can set up a maven project with a java/R code tree (I have been doing that a lot lately). Or if you have a template to look at, that would be useful too, i guess.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> No, please go right ahead.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 17, 2012 at 3:02 PM, Josh Wills <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>> I'd like it to be separate at first, but I am happy to help. Github repo?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Oct 17, 2012 2:57 PM, "Dmitriy Lyubimov" <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> Ok, maybe there's a benefit to trying a JRI/RJava prototype on top of Crunch for something simple. This should both save time and prove or disprove whether Crunch-via-RJava integration is viable.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On my part i can try to do it within the Crunch framework, or we can keep it completely separate.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> -d
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 17, 2012 at 2:08 PM, Josh Wills <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> I am an avid R user and would be into it -- who gave the talk? Was it Murray Stokely?
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 17, 2012 at 2:05 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I was pretty excited to learn of Google's experience with an R mapping of flume java at one of the recent BARUGs. I think a lot of applications similar to what we do in Mahout could be prototyped using flume R.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I did not quite get the details of the Google implementation of the R mapping, but i am not sure that just a direct mapping from R to Crunch would be sufficient (and, for the most part, efficient). RJava/JRI and jni seem to be pretty terrible performers for doing that directly.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On top of that, I am thinking that if this project could have a contributed adapter to Mahout's distributed matrices, that would be a very good synergy.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Is there anyone interested in contributing/advising for an open source version of flume R support? Just gauging interest; the Crunch list seems like a natural place to poke.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> -Dmitriy
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>> Director of Data Science
>>>>>>>>>>>>>>>>>>>>>>>>> Cloudera <http://www.cloudera.com>
>>>>>>>>>>>>>>>>>>>>>>>>> Twitter: @josh_wills <http://twitter.com/josh_wills>
