Personally, I'd love to see Crunch mixed with Clojure. I was thinking about this myself, but I'd rather see someone who really knows Clojure take this on.
Just don't call it Clunch.

-- Joe

On Thu, Nov 15, 2012 at 5:04 AM, Victor Iacoban <[email protected]> wrote:

> Thanks Josh, will give this a try
>
> On Wed, Nov 14, 2012 at 9:54 PM, Josh Wills <[email protected]> wrote:
>
> > I'm always glad to help people to extend Crunch in ways that are useful
> > for them. I think that most things that involve type-related extensions
> > can be handled using the PTypes.derived() function, which can be used to
> > create custom PTypes that are mapped to underlying serialized types, so
> > that you could do something like
> >
> > // Forgive my syntax errors, I'm doing this w/o an IDE
> > PType<Object> objectType = PTypes.derived(Object.class,
> >     new InputMapFn<BytesWritable, Object>(),
> >     new OutputMapFn<Object, BytesWritable>(),
> >     Writables.writables(BytesWritable.class));
> >
> > ...which is essentially how Scrunch works: the PTypes { } functionality
> > in Scrunch maps from Scala types to Java types using the derived
> > functionality.
> >
> > The Converter stuff is internal to Avro and Writable, I can't think of a
> > case where that would need to be exposed outside the package (i.e., once
> > you've decided on whether to use Writables or Avro as your serialization
> > framework, the choice of Converter is fixed.)
> >
> > If you have a use case where the derived type can't handle the
> > conversion or is a poor choice for whatever reason, I'm all about having
> > a discussion and trying out different designs.
> >
> > Josh
> >
> > On Wed, Nov 14, 2012 at 6:18 PM, Victor Iacoban <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > I'm very interested in writing a wrapper library around Apache Crunch
> > > for Clojure, something similar to existing Scrunch.
> > > How do you recommend to start?
> > >
> > > I was looking through Crunch code and it looks like I can pretty
> > > easily integrate it in clojure by adding some custom WritableType type.
> > > Something like WritableType<Object, ByteWritable> with a custom
> > > converter or inputFn/outputFn functions.
> > >
> > > Regretfully there are several issues with this approach and instead
> > > I'd have to duplicate all those type classes for a new type set
> > > * WritableType has a package visible constructor so I cannot extend it
> > >   and cannot instantiate it
> > > * Converter is instantiated inside WritableType constructor so in case
> > >   I need a different converter I'm stuck
> > > * Writables has a factory method for WritableType but it's private
> > > * it looks like there is an attempt to support additional
> > >   WritableTypes through EXTENSIONS in Writables but it would only work
> > >   for cases where in WritableType<T, W> both T and W are hadoop
> > >   writables
> > >
> > > So what do you think is a best solution, is it possible to open up the
> > > api to support custom WritableTypes or the only option for me is to
> > > implement a new ClojurePType and all related classes?
> > >
> > > Hope I'm not too detailed, but at this stage you all are probably very
> > > familiar with the code
> > >
> > > Thanks,
> > > Victor
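
[Editor's sketch, not part of the original messages.] The derived-type idea in Josh's reply rests on a pair of functions that convert between an arbitrary object and a serialized form and back. InputMapFn and OutputMapFn in his snippet are placeholder names, so the following is only a minimal illustration of that round trip using plain JDK serialization. It does not use Crunch or Hadoop at all, and the class and method names (DerivedTypeSketch, toBytes, fromBytes) are hypothetical. In a real integration the same two conversions would presumably live in the two mapping functions handed to PTypes.derived(), with the byte array wrapped in a BytesWritable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the input/output mapping pair of a derived type.
// Uses JDK serialization only; no Crunch or Hadoop classes are involved.
final class DerivedTypeSketch {

    // "Output" direction: turn an in-memory object into raw bytes.
    // The object (and everything it references) must be Serializable.
    static byte[] toBytes(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    // "Input" direction: rebuild the object from the raw bytes.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Object> record = new ArrayList<>(Arrays.asList("word", 42L));
        Object restored = fromBytes(toBytes(record));
        // The restored value should be equal to, but not the same instance as, the original.
        System.out.println(record.equals(restored));
    }
}
```

JDK serialization is used here only because it is self-contained. A Clojure wrapper would more likely pick a serializer suited to Clojure data structures, which is a design choice this sketch does not address.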
