We have a few dozen files that need to be made available to all
mappers/reducers in the cluster while running Hive transformation steps.
It seems that `add archive` does not unpack the archive entries and make
them available directly on the default file path - and that is what we
are looking for.
What would be interesting would be to run a little experiment and find out
what the default PATH is on your data nodes. How much of a pain would it
be to run a little Python script that prints to stderr the values of the
environment variables $PATH and $PWD (or the output of the shell command 'pwd')?
@Stephen: given that the 'relative' path for Hive is a local downloads
directory on each local tasktracker in the cluster, my thought was that
if the archive were actually being expanded, then
somedir/somefileinthearchive should work. I will go ahead and test this
assumption.
I personally only know of adding a .jar file via `add archive`, but my
experience there is very limited. I believe that if you `add file` and the
file is a directory, it'll recursively take everything underneath, but I
know of nothing that inflates or untars things on the remote end automatically.
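If that reading is right, the directory variant would look something like this (the path is taken from later in the thread; this is an untested sketch, not a confirmed recipe):

```sql
-- Ship an entire directory to the cluster. Everything under it is taken
-- recursively; nothing is inflated or untarred automatically remotely.
ADD FILE /opt/am/ver/1.0/hive/hivetry;

-- Verify what was actually distributed to the session.
LIST FILES;
```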
Thanks for the tip on `add file` where the file is a directory. I will try that.
2013/6/20 Stephen Sprague sprag...@gmail.com
Yeah, the archive isn't unpacked on the remote side. I think `add archive`
is mostly used for finding Java packages, since the CLASSPATH will reference
the archive (and as such there is no need to expand it).
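That CLASSPATH-based usage would look roughly like this; the jar path and class name are hypothetical placeholders, not from the thread:

```sql
-- The archive itself lands on the task CLASSPATH, so Java can load
-- classes straight out of it without the archive ever being unpacked.
ADD ARCHIVE /opt/am/ver/1.0/hive/udfs.jar;
CREATE TEMPORARY FUNCTION my_udf AS 'com.example.MyUdf';
```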
On Thu, Jun 20, 2013 at 9:00 AM, Stephen Boesch java...@gmail.com wrote:
Stephen: would you be willing to share an example of specifying a
directory as the `add file` target? I have not seen this working.
I have attempted to use it as follows:
*We will access a script within the hivetry directory located here:*
hive> ! ls -l;
In *Attempt two*, are you not supposed to use hivetry as the
directory?
Maybe you should try giving the full path
/opt/am/ver/1.0/hive/hivetry/classifier_wf.py and see if it works.
Regards,
Ramki.
On Thu, Jun 20, 2013 at 9:28 AM, Stephen Boesch java...@gmail.com wrote:
Good eyes, Ramki! Thanks - this directory-in-place-of-filename approach
appears to be working. The script is getting loaded now using Attempt two,
i.e. hivetry/classification_wf.py as the script path.
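For the record, the working combination described above would look roughly like this; the table and column names are hypothetical sketches, only the paths come from the thread:

```sql
-- Distribute the directory once per session...
ADD FILE /opt/am/ver/1.0/hive/hivetry;

-- ...then reference the script relative to the directory name,
-- which reappears in each task's working directory.
SELECT TRANSFORM (input_col)
  USING 'python hivetry/classification_wf.py'
  AS (output_col)
FROM source_table;
```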
thanks again.
stephenb
2013/6/20 Ramki Palle ramki.pa...@gmail.com