Re: connecting hiveserver2 through ssh tunnel - time out

2014-08-25 Thread Kadir Sert
Hi,

could you please try,

ssh -o ServerAliveInterval=10 -L 10004:localhost:10004 murat@10.0.0.100
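(ServerAliveInterval=10 makes ssh send a keepalive probe after 10 seconds of
silence, which keeps an idle tunnel from being silently dropped.)

If it still times out, it is worth checking that the local end of the tunnel is
actually accepting connections before pointing Tableau at it. A minimal sketch
in Python (nothing Hive-specific, just a TCP connect to the forwarded port):

    import socket

    # Connect to the local end of the ssh tunnel; raises an error on failure.
    sock = socket.create_connection(("localhost", 10004), timeout=5)
    sock.close()
    print("tunnel endpoint is accepting connections")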



2014-08-25 20:52 GMT+03:00 murat migdisoglu:
> Hello,
>
> Due to some firewall restrictions, I need to connect from tableau to the
> hiveserver2 through ssh tunnel..
>
> I tried tunneling the port range 10000-10004 but Tableau still times out..
>
> My hiveserver2 is running on 10.0.0.100 and I'm on the 192. network.
>
> I tried
> ssh murat@10.0.0.100 -L 10000:localhost:10000 and configured Tableau to
> connect to localhost:10000 with no success..
>
>  What am I missing?
>
> Thanks


namenode shutdown: epoch number mismatch

2014-08-25 Thread cho ju il
hadoop version 2.4.1
The NameNode shut down because of an epoch number mismatch.

Why did the epoch numbers suddenly mismatch?
Why did the NameNode suddenly shut down?
 
 
*** namenode log 
 
2014-08-26 12:17:48,625 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: 
Rescanning after 30000 milliseconds
2014-08-26 12:17:48,646 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 
0 directive(s) and 0 block(s) in 21 millisecond(s).
2014-08-26 12:18:03,599 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 
192.10.1.209
2014-08-26 12:18:03,599 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: 
Rolling edit logs
2014-08-26 12:18:03,599 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: 
Ending log segment 22795096
2014-08-26 12:18:03,633 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: 
Number of transactions: 81 Total time for transactions(ms): 10 Number of 
transactions batched in Syncs: 0 Number of syncs: 14 SyncTimes(ms): 159 197
2014-08-26 12:18:03,675 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits 
file /data/dfs/name/current/edits_inprogress_00022795096 -> 
/data/dfs/name/current/edits_00022795096-00022795176
2014-08-26 12:18:03,675 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: 
Starting log segment at 22795177
2014-08-26 12:18:05,419 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap 
updated: 192.10.1.211:40010 is added to blk_1076515119_2774480 size 46226115
2014-08-26 12:18:05,420 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: BLOCK* 
processOverReplicatedBlock: Postponing processing of over-replicated 
blk_1076515119_2774480 since storage + 
[DISK]DS-405679644-192.10.1.201-40010-1401163823536:NORMALdatanode 
192.10.1.201:40010 does not yet have up-to-date block information.
2014-08-26 12:18:18,625 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: 
Rescanning after 30000 milliseconds
2014-08-26 12:18:18,642 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 
0 directive(s) and 0 block(s) in 17 millisecond(s).
2014-08-26 12:18:48,625 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: 
Rescanning after 30000 milliseconds
2014-08-26 12:18:48,642 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 
0 directive(s) and 0 block(s) in 17 millisecond(s).
2014-08-26 12:19:18,625 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: 
Rescanning after 30000 milliseconds
2014-08-26 12:19:18,642 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 
0 directive(s) and 0 block(s) in 17 millisecond(s).
2014-08-26 12:19:48,625 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: 
Rescanning after 30000 milliseconds
2014-08-26 12:19:48,642 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 
0 directive(s) and 0 block(s) in 17 millisecond(s).
2014-08-26 12:20:04,506 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 
192.10.1.209
2014-08-26 12:20:04,506 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: 
Rolling edit logs
2014-08-26 12:20:04,506 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: 
Ending log segment 22795177
2014-08-26 12:20:04,507 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: 
Number of transactions: 53 Total time for transactions(ms): 6 Number of 
transactions batched in Syncs: 0 Number of syncs: 1 SyncTimes(ms): 20 67
2014-08-26 12:20:04,544 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: 
Number of transactions: 53 Total time for transactions(ms): 6 Number of 
transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 48 76
2014-08-26 12:20:04,587 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits 
file /data/dfs/name/current/edits_inprogress_00022795177 -> 
/data/dfs/name/current/edits_00022795177-00022795229
2014-08-26 12:20:04,587 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: 
Starting log segment at 22795230
2014-08-26 12:20:18,625 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: 
Rescanning after 30000 milliseconds
2014-08-26 12:20:21,771 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started 
for active state
2014-08-26 12:20:59,008 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: 
Ending log segment 22795230
2014-08-26 12:20:59,009 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: NameNodeEditLogRoller was 
interrupted, exiting
2014-08-26 12:20:59,014 WARN 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Remote journal 
192.10.1.209:8485 failed to write txns 22795231-22795235. Will try to write to 
this JN again after the next log roll.
org.apache.hadoop.ipc.RemoteException(java.io.IOExcept

what do you call it when you use Tez?

2014-08-25 Thread Adaryl "Bob" Wakefield, MBA
You've got MapReduce jobs, right? What is it called if, instead, you're using
Tez? A Tez job?




Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData 



Re: Missing Snapshots for 2.5.0

2014-08-25 Thread Tsuyoshi OZAWA
Hi Mark,

Thanks for the report. I also confirmed that we cannot access the jars
for Hadoop 2.5.0.

Karthik, could you check this problem?

Thanks,
- Tsuyoshi

On Thu, Aug 21, 2014 at 2:08 AM, Campbell, Mark wrote:
> It seems that all the needed archives (yarn, mapreduce, etc.) are missing the
> 2.5.0 build folders.
>
>
>
> My Hadoop 2.5.0 build fails at the final stage because none of the
> dependencies can be found.
>
>
>
> Version 2.6.0 does seem to be in the list; however, no binaries are available
> that I can see.
>
>
>
> Please advise.
>
>
> Cheers,
> Mark
>
>
>
>
>
> Path /org/apache/hadoop/hadoop-mapreduce-client-app/2.5.0-SNAPSHOT/ not
> found in local storage of repository "Snapshots" [id=snapshots]
>
>
>
> Downloading:
> http://repository.jboss.org/nexus/content/groups/public/org/apache/hadoop/hadoop-mapreduce-client-app/2.5.0/hadoop-mapreduce-client-app-2.5.0.pom
>
> Downloading:
> http://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-mapreduce-client-app/2.5.0/hadoop-mapreduce-client-app-2.5.0.pom
>
> [WARNING] The POM for
> org.apache.hadoop:hadoop-mapreduce-client-app:jar:2.5.0 is missing, no
> dependency information available
>
> Downloading:
> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-yarn-api/2.5.0/hadoop-yarn-api-2.5.0.pom
>
> Downloading:
> http://repository.jboss.org/nexus/content/groups/public/org/apache/hadoop/hadoop-yarn-api/2.5.0/hadoop-yarn-api-2.5.0.pom
>
> Downloading:
> http://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-yarn-api/2.5.0/hadoop-yarn-api-2.5.0.pom
>
> [WARNING] The POM for org.apache.hadoop:hadoop-yarn-api:jar:2.5.0 is
> missing, no dependency information available
>
> Downloading:
> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-common/2.5.0/hadoop-common-2.5.0.jar
>
> Downloading:
> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-mapreduce-client-app/2.5.0/hadoop-mapreduce-client-app-2.5.0.jar



-- 
- Tsuyoshi


Local file system to access hdfs blocks

2014-08-25 Thread Demai Ni
Hi, folks,

I'm new to this area and hoping to get a couple of pointers.

I am using CentOS and have Hadoop set up using CDH 5.1 (Hadoop 2.3).

I am wondering whether there is an interface to get each HDFS block's
information in terms of the local file system.

For example, I can use "hadoop fsck /tmp/test.txt -files -blocks -racks" to get
a block ID and its replicas on the nodes, such as: repl=3 [/rack/hdfs01,
/rack/hdfs02...]
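
To make that concrete, here is a rough sketch of how the fsck output could be
scraped for the block IDs (untested; the parsing is illustrative):

    import subprocess

    def block_ids(hdfs_path):
        """Return the block IDs that fsck reports for an HDFS file."""
        out = subprocess.check_output(
            ["hadoop", "fsck", hdfs_path, "-files", "-blocks", "-locations"])
        ids = []
        for line in out.decode().splitlines():
            # fsck block lines look like:
            # "0. BP-...:blk_1073741825_1001 len=... repl=3 [...]"
            if "blk_" in line:
                token = line.split("blk_", 1)[1]
                ids.append("blk_" + token.split("_", 1)[0])
        return ids

    print(block_ids("/tmp/test.txt"))

As far as I understand, each block then lives on the datanode as an ordinary
local file named blk_<id> under the dfs.datanode.data.dir directories, though
reading it there would bypass HDFS checksums and permissions.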

With such info, is there a way to
1) log in to hdfs01 and read the block directly at the local file system level?


Thanks

Demai on the run

connecting hiveserver2 through ssh tunnel - time out

2014-08-25 Thread murat migdisoglu
Hello,

Due to some firewall restrictions, I need to connect from tableau to the
hiveserver2 through ssh tunnel..

I tried tunneling the port range 10000-10004 but Tableau still times
out..

My hiveserver2 is running on 10.0.0.100 and I'm on the 192. network.

I tried
ssh murat@10.0.0.100 -L 10000:localhost:10000 and configured Tableau to
connect to localhost:10000 with no success..

 What am I missing?

Thanks


fpcalc on Hadoop streaming can't find file

2014-08-25 Thread Edmund Day
I have an HDFS directory that contains audio files. I wish to run fpcalc on
each file using Hadoop streaming. I can do this locally with no problem, but in
Hadoop, fpcalc cannot see the files.
My code is:

    import shlex
    from subprocess import Popen, PIPE

    # Build the fpcalc command line and run it, capturing stdout/stderr.
    cli = './fpcalc -raw -length ' + str(sample_length) + ' ' + file_a
    cli_parts = shlex.split(cli)
    fpcalc_cli = Popen(cli_parts, stdin=PIPE, stderr=PIPE, stdout=PIPE)
    fpcalc_out, fpcalc_err = fpcalc_cli.communicate()

cli_parts is: ['./fpcalc', '-raw', '-length', '30',
'/user/hduser/audio/input/flacOriginal1.flac'] and runs fine locally.

fpcalc_err is:

    ERROR: couldn't open the file
    ERROR: unable to calculate fingerprint for file 
/user/hduser/audio/input/flacOriginal1.flac, skipping

the file DOES exist:

    hadoop fs -ls /user/hduser/audio/input/flacOriginal1.flac
    Found 1 items
    -rw-r--r--   1 hduser supergroup    2710019 2014-08-08 11:49 
/user/hduser/audio/input/flacOriginal1.flac


Can I point to a file like this in Hadoop streaming?
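
If not, would copying each file out of HDFS onto the task's local disk first be
a reasonable workaround? A minimal sketch of what I have in mind (untested; it
assumes the hadoop CLI is on the PATH of the streaming task):

    import os
    import shutil
    import subprocess
    import tempfile

    def fingerprint(hdfs_path, sample_length=30):
        tmp_dir = tempfile.mkdtemp()
        try:
            local_path = os.path.join(tmp_dir, os.path.basename(hdfs_path))
            # Pull the file out of HDFS; fpcalc can only open local paths.
            subprocess.check_call(
                ["hadoop", "fs", "-get", hdfs_path, local_path])
            p = subprocess.Popen(
                ["./fpcalc", "-raw", "-length", str(sample_length), local_path],
                stdout=subprocess.PIPE, stderr=subprocess.PIPE)
            return p.communicate()  # (fpcalc_out, fpcalc_err)
        finally:
            shutil.rmtree(tmp_dir)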

TIA 




Read how Aylesbury and the Earth were created, here:
http://edday.co.uk