RE: line breaks
Don't use \n to terminate lines if your fields contain newlines. Use something
else, like \002 or \001:
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\002';
...or something like that.
Travis Powell / tpow...@tealeaf.com
Tealeaf Technology / http
are almost every query will have some time-related component in it (and it
spreads out the data among partitions fairly well.)
Let me know if this works for you. We start every job with those first few
lines of Hive script. It works well for us.
Thanks,
Travis Powell
-----Original Message-----
Have you checked your logs? These are often the best places to start.
Look at the running job and click on the running count, the current
task, then the task logs.
Sometimes they're helpful, sometimes they're not.
http://hadoop-master:50030/jobtracker.jsp
Thanks,
Travis Powell
Can I partition by an existing field?
I have a 10 GB file with a date field and an hour of day field. Can I
load this file into a table, then insert-overwrite into another
partitioned table that uses those fields as a partition? Would something
like the following work?
INSERT OVERWRITE
Comments cannot be the first line of a file, if I recall correctly. I get my
name and email near the top by putting the comment right after a USE
statement for the database:
use default;
-- John Doe, j...@acme.ru
-- (c) Acme co 2007
Select * from (...)
Travis
? Or could I use a Python script with TRANSFORM()?
I'm aware of the collect_set source file, but I'm not really up to editing it:
https://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCollectSet.java
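
For what it's worth, the TRANSFORM() route probably doesn't need much code. Below is a minimal sketch, not a tested replacement for collect_set. It assumes Hive hands the script tab-separated "key<TAB>value" rows on stdin (its default for TRANSFORM) and that the rows arrive grouped by key, for example from a subquery using DISTRIBUTE BY and SORT BY on the key. The names collect_sets and main are made up for illustration.

```python
import sys
from itertools import groupby


def collect_sets(lines):
    """Yield 'key<TAB>v1,v2,...' for each run of consecutive rows sharing a key.

    Assumes each input line looks like 'key<TAB>value' and that rows with the
    same key are adjacent (the caller is responsible for sorting/distributing).
    """
    rows = (
        line.rstrip("\n").split("\t", 1)
        for line in lines
        if line.rstrip("\n")  # skip blank lines
    )
    for key, group in groupby(rows, key=lambda row: row[0]):
        seen = []  # a list keeps first-seen order; fine for small groups
        for row in group:
            value = row[1] if len(row) > 1 else ""
            if value not in seen:
                seen.append(value)
        yield key + "\t" + ",".join(seen)


def main(stdin, stdout):
    """Stream rows from stdin to stdout, one output line per key."""
    for out_line in collect_sets(stdin):
        stdout.write(out_line + "\n")


# Hive runs the file as a script, so the saved file would end with:
#     main(sys.stdin, sys.stdout)
```

Saved as, say, collect_values.py (a hypothetical file name), it would be registered with ADD FILE and called with something like SELECT TRANSFORM(k, v) USING 'python collect_values.py' AS (k, vals), where k and v stand in for your own columns. The values come back as one comma-joined string rather than a real array, so this only helps if a delimited string is good enough downstream.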
Thanks!
Travis Powell
Don't bother trying to do Hive on Cygwin w/ Windows.
I tried a million different configurations and could never get it to
work.
I'd recommend downloading the Cloudera VM and the free VMware Player
instead for development work, since those are tried-and-true work
environments.
Travis Powell