Hi,
I hope I understood your question correctly: have you defined your table?
Like
CREATE TABLE yourtable (row1 STRING, row2 STRING, row3 STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY 'YOUR TERMINATOR'
STORED AS TEXTFILE;
row* = a name of your choice; for the available data types, see the documentation.
After
Hi,
Thanks for the response.
Yes, you understood my question correctly.
An example line from my log messages is shown below:
[2011-10-17 16:30:57,281] [ INFO] [33157362@qtp-28456974-0]
[net.hp.tr.webservice.referenceimplcustomer.resource.CustomersResource]
[Organization: Travelocity] [Client: AA] [Location
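A bracketed log format like the sample above can be read with the RegexSerDe shipped in Hive's contrib jar; a minimal sketch, where the jar path, table, and column names are my assumptions, and the regex only covers the fields visible in the sample:

```sql
-- path to the contrib jar is an assumption; adjust for your install
ADD JAR /usr/lib/hive/lib/hive-contrib.jar;

CREATE TABLE app_logs (
  log_time STRING,
  level    STRING,
  thread   STRING,
  class    STRING,
  org      STRING,
  client   STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  -- one capture group per column; extend for any remaining [...] fields
  "input.regex" = "\\[([^\\]]+)\\] \\[ *([^\\]]+)\\] \\[([^\\]]+)\\] \\[([^\\]]+)\\] \\[Organization: ([^\\]]+)\\] \\[Client: ([^\\]]+)\\].*"
)
STORED AS TEXTFILE;
```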
Hi All,
My setup is
hadoop-0.20.203.0
hive-0.7.1
I have a 5-node cluster in total: 4 datanodes and 1 namenode (which is
also acting as the secondary namenode). On the namenode I have set up Hive
with Derby in server mode to support multiple Hive server connections.
I have inserted plain text
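For reference, Derby server mode as described above is typically configured in hive-site.xml; a sketch where the host, port, and database name are assumptions for your setup:

```xml
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <!-- namenode-host and port 1527 are assumptions -->
  <value>jdbc:derby://namenode-host:1527/metastore_db;create=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>org.apache.derby.jdbc.ClientDriver</value>
</property>
```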
Hi Mark,
Thanks for your response. I tried the skew optimization and I also watched the
video by Lin and Namit. From what I understand about skew join, instead of
doing the join in a single pass, it is divided into 2 stages.
Stage 1
Join the non-skewed pairs, and write the skewed pairs into temporary files on HDFS.
Stage 2
Do a
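For reference, the two-stage skew join described above is controlled by a pair of Hive settings; a sketch (the key threshold shown is Hive's documented default):

```sql
-- enable the two-stage skew join described above
set hive.optimize.skewjoin = true;
-- keys with more rows than this are treated as skewed (default 100000)
set hive.skewjoin.key = 100000;
```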
Hi,
In your case the total file size isn't the main factor that reduces
performance; the number of files is.
To test this, try merging those 2000+ files into one (or a few) big file(s),
then upload the result to HDFS and test Hive performance again (it should be
noticeably better). If this works, you should think about
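One way to run the merge test suggested above entirely inside Hive is to copy the data into a fresh table while asking Hive to merge small output files; a sketch with hypothetical table names:

```sql
-- ask Hive to merge the small files produced by the job
set hive.merge.mapfiles = true;
set hive.merge.mapredfiles = true;

-- logs_raw / logs_merged are hypothetical names
CREATE TABLE logs_merged LIKE logs_raw;
INSERT OVERWRITE TABLE logs_merged
SELECT * FROM logs_raw;
```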
Hi Paul,
I am having the same problem. Do you know of an efficient way to merge the
files?
-Mohit
On Tue, Dec 6, 2011 at 8:14 PM, Paul Mackles pmack...@adobe.com wrote:
How much time is it spending in the map/reduce phases, respectively? The
large number of files could be creating a lot of
Hi,
I opened the web console for Hive using http://localhost:/hwi
In the Browse Schema option, I could see only the default Hive schema name
and description.
I am not able to view the tables. What could the issue be?
I have created 2 tables under the default schema, but could not see them.
I get this error message in the console:
11/12/06 08:14:50 INFO DataNucleus.MetaData: Registering listener for metadata
initialisation
11/12/06 08:14:50 INFO metastore.ObjectStore: Initialized ObjectStore
11/12/06 08:14:50 WARN DataNucleus.MetaData: MetaData Parser encountered an
error in
Hi Sangeetha,
Hive uses SerDe (Serializer/Deserializer) for reading data from and writing to
HDFS. You have many options for choosing the SerDe for your table.
For example, if your file contains tab delimited fields, you could use the
default SerDe (by not specifying any SerDe) and specify the
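For example, a tab-delimited table using the default SerDe might be declared like this (table and column names are my assumptions):

```sql
CREATE TABLE events (
  event_time STRING,
  user_id    STRING,
  message    STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;
```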
Hi Sangeetha,
Sorry, I was on the road, so the answer took a while.
As Mark wrote, a SerDe will be a good start. If it's useful for you, take a
look at http://code.google.com/p/hive-json-serde/wiki/GettingStarted.
- alex
On Tue, Dec 6, 2011 at 10:26 AM, sangeetha k get2sa...@yahoo.com wrote:
Hi,
Can you try FROM B JOIN A?
One simple rule for joins in Hive is: largest table last. The smaller tables
can then be buffered into the distributed cache for fast retrieval and
comparison.
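The rule above can be sketched as follows; alternatively, a MAPJOIN hint tells Hive explicitly which table is small enough to hold in memory (table names are hypothetical):

```sql
-- smaller table (a) first, largest table (b) last so only it is streamed;
-- the hint asks Hive to load a into memory for a map-side join
SELECT /*+ MAPJOIN(a) */ b.id, a.name
FROM small_table a
JOIN big_table b ON (a.id = b.id);
```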
Thanks
Aaron
On Tue, Dec 6, 2011 at 4:01 AM, john smith js1987.sm...@gmail.com wrote:
Hi Mark,
Thanks for your
Hi,
I am trying to understand the output of hive Explain command. I found the
documentation provided (
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain )
to be of little help. Is there any other place where I can find the
detailed documentation on this?
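In the meantime, running EXPLAIN (or EXPLAIN EXTENDED for more detail) on a small query and reading through the stage tree is a reasonable way to learn the output format; a sketch with a hypothetical table:

```sql
-- prints the abstract syntax tree, the stage dependency graph,
-- and the plan for each map/reduce stage
EXPLAIN EXTENDED
SELECT client, count(*) FROM app_logs GROUP BY client;
```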
Hiroyuki, were you
Pig has a Log loader in Piggybank. You can use that to generate the columns
of that table and make the table point to it.
Take a look--
https://github.com/apache/pig/tree/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/apachelog
Thanks,
Aniket
On Tue, Dec 6, 2011 at
How about a simple Pig script with a load and a store statement? Set the max
number of reducers to, say, 20 or 30; that way you will only have 20-30 files
as output. Then put these files in the Hive directory. Make sure to match the
delimiters between Hive and Pig.
-Ayon