Hi folks,
I have a question.
In my tmp folder on HDFS, the user hadoop is taking up a lot of
space. I need to free this space, so please help me out.
/tmp/hive-hadoop/hive_2012-03-13_04-32-14_701_8751021191431391338/-ext-10002/
Is it part of the data stored in HDFS, or is it just a
To my knowledge, this is all temporary data.
All the data related to your tables is stored at the location that you can
get with "desc formatted table_name".
This is a temporary Hive storage place. If you kill a job midway,
the data in /tmp/ is left behind as stale data.
On Tue, Mar 13, 2012 at
Hello Weidong Bian
Did you see the following configuration property in the conf directory?
<property>
  <name>mapred.reduce.tasks</name>
  <value>-1</value>
  <description>The default number of reduce tasks per job. Typically set
  to a prime close to the number of available hosts. Ignored when
Hi Keith,
We generally store date columns as a string in a similar format to ISO 8601
(yyyy-mm-dd hh:MM:ss). This way, when we put the date column in the ORDER BY
clause, it will be sorted chronologically. It also saves us the trouble of
whipping out a unix timestamp calculator to figure out
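
A small Python sketch (not Hive code) of why this approach works: when the format is year-first, fixed-width, and zero-padded, plain string comparison agrees with chronological order. The sample timestamps are made up for illustration.

```python
from datetime import datetime

# Made-up sample timestamps in a year-first, zero-padded layout.
stamps = [
    "2012-03-13 16:54:00",
    "2011-12-31 23:59:59",
    "2012-03-13 04:32:14",
    "2012-01-02 00:00:00",
]

# Plain string sort, analogous to ORDER BY on a string column.
by_string = sorted(stamps)

# Sort by the parsed date-time value, for comparison.
by_time = sorted(stamps, key=lambda s: datetime.strptime(s, "%Y-%m-%d %H:%M:%S"))

# True when the lexicographic order matches the chronological order.
same_order = by_string == by_time
```

The property breaks as soon as fields are not zero-padded or the year does not come first, which is why American-style dates do not sort correctly as strings.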
Yes, it's in my hive-default.xml, and Hive decided to use only one reducer,
so I thought increasing it to 5 might help, which it doesn't.
Anyway, scanning the largest table 6 times isn't efficient, hence my question.
On Wed, Mar 14, 2012 at 12:37 AM, Jagat jagatsi...@gmail.com wrote:
Hello Weidong Bian
I see, you store the date-time as a lexicographically sortable string. That's
fine, but I'm operating on existing csv tables. I guess I could whip up a
hadoop job to convert all the date-time columns to lexicographic strings and
then wrap hive around the resulting converted tables. I was
Hi Keith,
Do you know exactly how an algorithm should be structured in order to fit
the MapReduce framework? Could you point me to some references?
Thanks and Regards,
Mahsa
On Tue, Mar 13, 2012 at 12:49 PM, Keith Wiley kwi...@keithwiley.com wrote:
Wrapping hive around existing csv files consists of manually naming and typing
every column during the creation command. I have several csv tables and some
of them have a ton of columns. I would love a way to create hive tables which
automatically infers the column types by attempting various
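
One way such inference could be done outside Hive is sketched below in Python: read the CSV, try progressively looser parsers on each column, and emit the column list for a CREATE TABLE statement. This is only a sketch under assumptions: the function names are hypothetical, the CSV is assumed to have a header row, and only BIGINT, DOUBLE, and STRING are considered.

```python
import csv
import io


def infer_type(values):
    """Pick a Hive type name by trying parsers from strictest to loosest."""
    def all_parse(parser):
        try:
            for v in values:
                parser(v)
            return True
        except ValueError:
            return False

    if all_parse(int):
        return "BIGINT"
    if all_parse(float):
        return "DOUBLE"
    return "STRING"  # fallback when nothing stricter fits


def column_defs(csv_text):
    """Return 'name TYPE, ...' for a CSV whose first row holds column names."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    defs = []
    for i, name in enumerate(header):
        defs.append(f"{name} {infer_type([row[i] for row in data])}")
    return ", ".join(defs)


sample = "id,price,name\n1,2.5,foo\n2,3,bar\n"
columns = column_defs(sample)
```

Python's int() and float() are more permissive than Hive's parsers (for example, they accept "1_000" and "nan"), so the result would still need a manual check before use.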
For theta joins, you'll have to convert the query to an equi-join, and then
filter for non-equality in the WHERE clause. Depending upon the size of each
table, you might consider looking at map-side joins, which will allow for doing
non-equality filters during a join before it's passed to the
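
The equi-join-then-filter idea can be illustrated with a small Python sketch (not Hive code). The rows and the inequality condition are made up; the first step plays the role of JOIN ... ON with an equality, and the second plays the role of the WHERE clause.

```python
from collections import defaultdict

# Made-up rows of the form (key, value).
left = [("a", 1), ("a", 5), ("b", 2)]
right = [("a", 3), ("b", 2), ("b", 7)]

# Step 1: equi-join on the key (JOIN ... ON l.key = r.key).
index = defaultdict(list)
for key, value in right:
    index[key].append(value)
joined = [(k, lv, rv) for k, lv in left for rv in index[k]]

# Step 2: apply the non-equality condition afterwards (WHERE l.value < r.value).
result = [row for row in joined if row[1] < row[2]]
```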
Do the joins share the same key?
2012/3/13 Bruce Bian weidong@gmail.com
Yes, it's in my hive-default.xml, and Hive decided to use only one reducer,
so I thought increasing it to 5 might help, which it doesn't.
Anyway, scanning the largest table 6 times isn't efficient, hence my
question.
On
Hi Keith,
You should also consider writing your own UDF that takes in the date in
American format and spits out a lexicographical string.
That way you don't have to modify your base data, just use this newly created
from_american_date(String date) UDF to get your new date string in
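
The conversion such a UDF would perform can be sketched in Python. A real Hive UDF would be written in Java; this only shows the logic. The input layout (month/day/year, such as "3/13/2012") is an assumption, since the thread does not spell out the exact format.

```python
from datetime import datetime


def from_american_date(date: str) -> str:
    """Convert an assumed month/day/year string to a sortable year-first string."""
    # Assumption: input looks like "3/13/2012" or "03/13/2012".
    parsed = datetime.strptime(date, "%m/%d/%Y")
    # Zero-padded, year-first output sorts chronologically as a plain string.
    return parsed.strftime("%Y-%m-%d")


converted = from_american_date("3/13/2012")
```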
Mark Grover,
You could do something like that. However, you can structure the table as:
CREATE TABLE X (stuff MAP<STRING,STRING>)
or
CREATE TABLE X (stuff ARRAY<STRING>)
You can then define a view over these structures that allows you to
cherry-pick the fields you want.
Edward
On Tue, Mar 13, 2012 at 1:03 PM, Keith
If you don't want to modify your CSV files, I would suggest doing the
conversion as part of the query. For that, you can either include the
conversion in each query, or you can create a view of your table that includes
a column with the converted date.
Either way, you may want to try
Um, this is weird. It simply isn't modifying the order of the returned rows at
all. I get the same result with no 'order by' clause as with one. Adding a
limit or specifying 'asc' has no effect. Using 'sort by' also has no effect.
The column used for ordering is type INT. In the example
You have attributevalue in quotes which makes it a constant literal.
igor
decide.com
On Tue, Mar 13, 2012 at 1:54 PM, Keith Wiley kwi...@keithwiley.com wrote:
Um, this is weird. It simply isn't modifying the order of the returned
rows at all. I get the same result with no 'order by' clause
This syntax is wrong for both hive and SQL:
hive> select * from stringmap where attributename='foo' order by 'attributevalue';
This is right.
hive> select * from stringmap where attributename='foo' order by attributevalue;
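
A Python analogy (not Hive code) for why the quoted version has no effect: sorting by a constant gives every row the same sort key, so there is nothing to reorder. The rows are made up for illustration.

```python
# Made-up rows standing in for the stringmap table.
rows = [
    {"attributename": "foo", "attributevalue": 3},
    {"attributename": "foo", "attributevalue": 1},
    {"attributename": "foo", "attributevalue": 2},
]

# Like ORDER BY 'attributevalue': the key is the same string for every row.
by_literal = sorted(rows, key=lambda r: "attributevalue")

# Like ORDER BY attributevalue: the key is the column's value.
by_column = sorted(rows, key=lambda r: r["attributevalue"])
```

Python's sort is stable, so by_literal comes back in the original order. Hive makes no such promise, but a constant key likewise gives it nothing to sort on.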
On Tue, Mar 13, 2012 at 4:54 PM, Keith Wiley kwi...@keithwiley.com wrote:
On Mar 13, 2012, at 13:57 , Igor Tatarinov wrote:
You have attributevalue in quotes which makes it a constant literal.
igor
decide.com
Argh! You are correct good sir!
thanks
Keith Wiley
What string values in a csv field are parsable by Hive as booleans? If I
indicate that a column is of type boolean when wrapping an external table
around a csv file, what are the legal values? I can imagine numerous
possibilities, for example (for the true values):
0
t
T
true
True
TRUE
y
Y
I obviously intended '1', not '0' as an example of a true value.
Keith Wiley kwi...@keithwiley.com keithwiley.com music.keithwiley.com
The easy confidence with which I know another man's religion is folly