As far as reading the tar itself goes: since Samza just needs an
implementation of o.a.h.FileSystem, pointing the configuration at
LocalFileSystem should work, though I've not tried it.
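
For concreteness, a rough sketch of what that configuration might look like
(untested — the exact key names, especially fs.file.impl, are my assumption
by analogy with how the HTTP filesystem is wired up, and the path is a
placeholder):

```properties
# Point the package path at a local file and map the file:// scheme
# to Hadoop's LocalFileSystem (key names assumed, path is a placeholder).
yarn.package.path=file:///path/to/my-job.tar.gz
fs.file.impl=org.apache.hadoop.fs.LocalFileSystem
```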

In terms of loading extra files (akin to Hadoop's DistributedCache, I
imagine?), there's no direct support for this at the moment.  What I'd do
is have each task bring in the files (hosted, for instance, on a small
local HTTP server) during the initialization phase, by implementing
InitableStreamTask.
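
To make that concrete, here's a minimal, untested sketch of the fetch step
you could call from the task's init hook. The ResourceFetcher class and the
idea of serving the file from a small HTTP server on the submitting host
are my assumptions for illustration, not part of the Samza API:

```java
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class ResourceFetcher {

    // Download a single resource to a local path. Called from the task's
    // init hook, before any messages are processed. The URL could point at
    // a small HTTP server on the node that launched the job (assumption).
    public static Path fetch(String url, Path dest) throws Exception {
        try (InputStream in = new URL(url).openStream()) {
            Files.copy(in, dest, StandardCopyOption.REPLACE_EXISTING);
        }
        return dest;
    }
}
```

Each container would then fetch its own copy during init, so the file never
needs to be baked into the tar.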

For more direct support, we could consider adding a configuration option
listing resources to obtain during startup, or, as part of another idea
I've been kicking around, we could introduce user code that runs on the AM
before task startup and could grab these files and host them locally.

-Jakob


On Fri, Mar 14, 2014 at 9:13 AM, Anh Thu Vu <[email protected]> wrote:

> Hmm, I think Samza does support reading the tar from the local FS too. If
> so, I just have the first question, about allowing extra optional
> resources.
>
> Casey
>
>
> On Fri, Mar 14, 2014 at 4:02 PM, Anh Thu Vu <[email protected]> wrote:
>
> > Hi guys,
> >
> > I have an extra (data) file that is required for my tasks in each
> > container. This file is created on-the-fly and not included in my
> > pre-created tar file. So I wonder if samza currently supports this
> > functionality (to let user specify the extra resources to include when
> > launching a task).
> >
> > If not, what do you guys think about having this with a patch?
> >
> > Lastly, if I'm not wrong, Samza currently only supports HTTP or HDFS as
> > the filesystem to read the tar file from. I think it would be nice to be
> > able to read the tar file from the local filesystem of the launching
> > node. What do you think?
> >
> > Casey
> >
>
