Re: DataMovementType impls

2014-07-25 Thread Hitesh Shah
DataMovementEvent is a construct defined for an Input/Output pair to communicate with each other. The actual information being passed between the two is not understood by the framework, except that it is a byte payload to be handed off from the source to the destination. Users are not expected

Re:

2014-07-31 Thread Hitesh Shah
Hi This looks like a 3-vertex DAG. It could possibly be a linear DAG such as Map1 -> Map2 -> Reduce3 or a Join DAG where Map1 -> Reduce3 and Map2 -> Reduce3. If you can get the application logs from YARN ( using bin/yarn logs -applicationId application_1404180111945_438880 ), you will be a
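The two shapes named in this message can be written down as edge lists. The following is only a minimal sketch: the vertex names come from the message, while the `inputs_of` helper is invented for illustration and is not part of the Tez API.

```python
# Hypothetical illustration only: edge lists for the two 3-vertex shapes
# mentioned in the message. Vertex names are taken from the message; the
# helper below is invented for this sketch and is not a Tez API.
LINEAR_DAG = [("Map1", "Map2"), ("Map2", "Reduce3")]
JOIN_DAG = [("Map1", "Reduce3"), ("Map2", "Reduce3")]


def inputs_of(edges, vertex):
    """Return the upstream vertices feeding `vertex`, in sorted order."""
    return sorted(src for src, dst in edges if dst == vertex)


if __name__ == "__main__":
    for name, edges in (("linear", LINEAR_DAG), ("join", JOIN_DAG)):
        print(name, "DAG: Reduce3 reads from", inputs_of(edges, "Reduce3"))
```

In the linear shape Reduce3 has a single upstream vertex, while in the join shape it reads from both map vertices.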

Re:

2014-07-31 Thread Hitesh Shah
se explain how the > number of tasks got reduced here ? > > Thanks. > > > On Thu, Jul 31, 2014 at 9:20 PM, Hitesh Shah wrote: > Hi > > This looks like a 3-vertex DAG. It could be possibly be a linear DAG such as > Map1 -> Map2 -> Reduce3 or a Join DAG where >

Re: ShuffledMergedInput not started

2014-08-07 Thread Hitesh Shah
However, the exception stack trace is misleading with respect to the actual underlying problem. @Thad, could you file a bug in that regard? thanks — Hitesh On Aug 7, 2014, at 1:27 PM, Bikas Saha wrote: > The processor is expected to call start() before using the input. That’s the > recommend

Re: Application Shutdown

2014-08-11 Thread Hitesh Shah
Hi Thad, To clarify, are you looking for an API in the client to invoke a callback when a session has shut down? Or invoke user logic in the AM when the session is about to shut down? For the former, given that the client does not poll the session continuously, detecting the session shutdown is o

Re: Problems accessing tez on ec2

2014-08-12 Thread Hitesh Shah
It seems like the AM is binding to the external/public hostname and not the internal IP. Could you look for this log message in the AM logs: "Instantiated DAGClientRPCServer at". This will provide some information as to what the AM is binding to. thanks — Hitesh On Aug 11, 2014, at 7:43 AM,

Re: Default value for TEZ_AM_LAUNCH_ENV and TEZ_TASK_LAUNCH

2014-08-13 Thread Hitesh Shah
Subroto, could you file a jira for this and mark it as a blocker for 0.5. This should ideally be addressed before then. As commented on TEZ-1127, it is a question as to what the default should be - whether HADOOP_COMMON_HOME or HADOOP_PREFIX and to some extent, it needs to handle Windows deploym

Re: run hive atop tez

2014-08-24 Thread Hitesh Shah
Hello Robert, Tez has gone through quite a few API changes since the 0.4.0 release. The changes were mainly to clean up and simplify the APIs to make it easier for users to write native Tez DAGs. This had the unfortunate side-effect of making 0.5 incompatible with Hive 0.13.x. For now, I bel

Re: container size

2014-09-09 Thread Hitesh Shah
Hi Robert, From a Tez point of view, a user of the Tez APIs can define the container size on a per vertex basis. I believe currently, Hive, when using Tez, uses a single size for all its vertices. thanks — Hitesh On Sep 9, 2014, at 6:37 PM, Grandl Robert wrote: > Hi guys, > > It seems the

Re: noob local resource question

2014-09-11 Thread Hitesh Shah
Hi Chris, Unlike MR and its support of distributed cache, Tez does not make any inferences into the structure of the LocalResources specified ( i.e structure of tarball, jar, etc ) and therefore expects the user to modify the class path as needed. It might be something worth considering as a

Re: Getting number of vertex tasks at runtime

2014-09-11 Thread Hitesh Shah
Hello Kostas, 0.5.0 is released and should be available on the apache releases maven repo. There should likely be a 0.5.1 release at some point in the near future with various bug fixes. As for 0.6.0-SNAPSHOT, we are adding features and bug fixes to this but it should be compatible with 0.5.0.

Re: ShuffleVertexManager looks only for input vertices with DataMovement being SCATTER_GATHER

2014-09-12 Thread Hitesh Shah
Hi Robert, This is a bug ( which has been seen in other scenarios too ). You can follow some of the discussion related to this issue at https://issues.apache.org/jira/browse/TEZ-1522. thanks — Hitesh On Sep 12, 2014, at 11:20 AM, Grandl Robert wrote: > Hi guys, > > During some of my expe

Re: 2 processes for the same container both not registered in YARN UI

2014-10-13 Thread Hitesh Shah
Hi Johannes, To clarify, are you concerned about the 2 processes with pids 26964 and 27046 ? If yes, the first one is the bash script invoked by the YARN NodeManager to launch the actual TezChild JVM. — Hitesh On Oct 13, 2014, at 3:18 AM, Johannes Zillmann wrote: > Hey Bikas, > > this was

[ANNOUNCE] New Apache Tez Committer - Jeff Zhang

2014-10-16 Thread Hitesh Shah
Hi all, I am very pleased to announce that Jeff Zhang has been voted in as a committer in the Apache Tez project. We appreciate all the work Jeff has put into the project so far, and are looking forward to his future contributions. Welcome aboard, Jeff. Thanks, — Hitesh on behalf of the Apache

[DISCUSS] Exposing user payloads in the Tez UI

2014-10-21 Thread Hitesh Shah
Hi folks As part of the Tez APIs, every object ( Input / Output / Processor / EdgeManagerPlugin / VertexManagerPlugin ) can be associated with its own user provided payload to set itself up. The format of this payload is not known to Tez as it could be a java serialized object/protobuf/xml, et

Re: Questions about Tez under the hood

2014-10-29 Thread Hitesh Shah
Answers inline. — Hitesh On Oct 29, 2014, at 2:33 AM, Fabio wrote: > Thanks Bikas for your answer and suggestion, actually my work deals more with > high level modeling/behavior/performance of Tez, but there is another guy who > is going to handle Tez sources, I will suggest him to contribute

Re: Question Tez under the hood

2014-11-03 Thread Hitesh Shah
Hi For the most part, each Hive CLI session or JDBC/ODBC connection to HiveServer2 would map to a single Application Master. HiveServer does have some optimizations though ( to avoid the overhead cost of launching a new AM ) where it tries to keep a pool of ApplicationMasters around and does s

Re: [VOTE] Release Apache Tez 0.5.2 rc0

2014-11-07 Thread Hitesh Shah
+1(binding) Verified sigs, checksums. Built from source and ran example jobs on a single node cluster. — Hitesh On Nov 4, 2014, at 6:42 PM, Bikas Saha wrote: > > Folks, > > I have created an Apache Tez 0.5.2 release candidate (rc0). > > This is an incremental release that contains many

Re: Tez TimelineServer

2014-11-11 Thread Hitesh Shah
Hi Dharmesh Would you mind filing a jira and attaching the application logs obtained by running “bin/yarn logs -applicationId ” Also, I am assuming you are setting “tez.am.log.level” to DEBUG to get debug logging? Also, when you query the timeline server, what are the calls that you are maki

Re: Tez TimelineServer

2014-11-12 Thread Hitesh Shah
I am only looking at the TLS web UI. Planning to do programmatic > > access if this runs. > > > > Thanks, > > Dharmesh > > > > On Tue, Nov 11, 2014 at 9:41 PM, Hitesh Shah wrote: > >> > >> Hi Dharmesh > >> > >> Would you m

Re: Tez TimelineServer

2014-11-13 Thread Hitesh Shah
.HttpConnection.handle(HttpConnection.java:404) > at > org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410) > at > org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) > > Nov 13, 2014 1:17:27 AM > com.sun.jerse

Re: SLS and Tez

2014-11-14 Thread Hitesh Shah
Hello Fabio We do not have a job trace file generated by Tez and therefore no simulator that can re-run the trace. We do store some historical data for the job but the level of tooling around it is pretty minimal. — Hitesh On Nov 14, 2014, at 3:29 AM, Fabio wrote: > With SLS (Yarn Schedule

Re: SLS and Tez

2014-11-14 Thread Hitesh Shah
> Fabio > > On 11/14/2014 07:54 PM, Hitesh Shah wrote: >> Hello Fabio >> >> We do not have a job trace file generated by Tez and therefore no simulator >> that can re-run the trace. We do store some historical data for the job but >> the level of tooling a

Re: SLS and Tez

2014-11-14 Thread Hitesh Shah
minutes job (not for me, for sure). > I'm not even sure it's a job it is worth doing, since I wasn't able to find > any information about the reliability of SLS as a Yarn simulator. Officially, > it is just used to test the scheduler behavior, and as far as I remember it >

Re: Native Compression Lib loading failed

2014-11-17 Thread Hitesh Shah
Hi Subroto, It could be an installer/distro issue. You may need to redirect your question to the Hortonworks forums to get an answer on this. But yes, you are right - the env needs to be setup correctly to load the required native libs. Out of curiosity, are you using Ambari to setup your cl

Re: Enabling Tez sessions on HiveServer2

2014-12-02 Thread Hitesh Shah
BCC’ed user@tez. This question belongs to either the hive user list or the Hortonworks user forums. thanks — Hitesh On Dec 2, 2014, at 1:28 PM, Pala M Muthaia wrote: > Hi, > > I am trying to get Tez sessions enabled with HS2. I start the HiveServer2 > instance with the flag "-hiveconf hive

Re: [VOTE] Release Apache Tez 0.5.3 rc0

2014-12-03 Thread Hitesh Shah
In light of https://issues.apache.org/jira/browse/TEZ-1818, it might be better to roll out a new RC with this fix? Thoughts? — Hitesh On Dec 2, 2014, at 1:08 PM, Bikas Saha wrote: > Folks, > > > > I have created an Apache Tez 0.5.3 release candidate (rc0). > > > > This is an incrementa

[DISCUSS] What versions of hadoop should we support?

2014-12-04 Thread Hitesh Shah
Hello folks Could folks who are following these mailing lists do a raise of hands on which versions of Hadoop you are trying to run Tez on? 1) 2.2.x 2) 2.3.x 3) 2.4.x 4) 2.5.x 5) 2.6.x Given that we are building out the UI and for the large part, the History UI ( i.e. the one which will disp

Re: Vertex/task priority and container request

2014-12-05 Thread Hitesh Shah
Hi Fabio, Regarding the second container assignment, the critical aspect is "reusedContainer=true". It is re-using the container used for the parent vertex’s task hence the priority is not relevant. In such cases, eventually the priority 4 container will be released without being used. If you

Re: [DISCUSS] What versions of hadoop should we support?

2014-12-08 Thread Hitesh Shah
, Moore, Douglas wrote: > 2.4.x and beyond for the customers we have > > Sent from my iPhone > >> On Dec 4, 2014, at 12:36 PM, Hitesh Shah wrote: >> >> Hello folks >> >> Could folks who are following these mailing lists do a raise of hands on >> wh

Re: Containers lifespan in session mode

2014-12-09 Thread Hitesh Shah
We probably need to fix the docs that refer to "tez.am.container.session.delay-allocation-millis". Can you point out which doc you are referring to? This setting was removed in 0.5.x in favor of the min/max release timeouts. To achieve the same behavior as tez.am.container.session.delay-allocation

Re: [DISCUSS] What versions of hadoop should we support?

2014-12-09 Thread Hitesh Shah
e is… etc.. > If the Hadoop UI integration could be delivered with 2.2 that would be > awesome! > > best > Johannes > >> On 08 Dec 2014, at 19:02, Hitesh Shah wrote: >> >> Thanks for the info, Douglas. We will not be dropping support for 2.4 for >> sure

Re: [DISCUSS] What versions of hadoop should we support?

2014-12-11 Thread Hitesh Shah
>>> >>> Publishing multiple jars - do you mean per hadoop version supported ? >>> That'll likely be very difficult to use - what would the dependency be ? >>> Will there be binary compatibility issues if someone builds against tez jars >>> b

Re: Tez UI early trial?

2014-12-12 Thread Hitesh Shah
Hello Xiaoyong, Could you shed more light on the problems you have been encountering ( and with which version of the hadoop )? Some of the details on how to use YARN timeline are documented here: http://tez.apache.org/tez_yarn_timeline.html - let me know if that helps. Let us also know what ve

Re: Tez UI early trial?

2014-12-15 Thread Hitesh Shah
s > using Tez 0.6.0. > > The tar under /apps/tez/ is tez-0.6.0.2.2.0.0-1084.tar.gz > > Xiaoyong > > > -Original Message- > From: Hitesh Shah [mailto:hit...@apache.org] > Sent: Saturday, December 13, 2014 12:17 AM > To: user@tez.apache.o

Re: Tez UI early trial?

2014-12-15 Thread Hitesh Shah
ar under /apps/tez/ is tez-0.6.0.2.2.0.0-1084.tar.gz > > Xiaoyong > > > -Original Message- > From: Hitesh Shah [mailto:hit...@apache.org] > Sent: Saturday, December 13, 2014 12:17 AM > To: user@tez.apache.org > Subject: Re: Tez UI early trial? > > Hello Xiao

Re: is there way to set map and reducer number for hive on tez manually?

2015-01-05 Thread Hitesh Shah
Hello again, Wrong list for this question. Please send your hive specific question to the hive-user list ( u...@hive.apache.org ). thanks — Hitesh On Jan 5, 2015, at 2:23 PM, SkaterXu wrote: > is there way to set map and reducer number for hive on tez manually? > i want hive on tez to use a

Re: Tez Hive Job failing during initialization

2015-01-15 Thread Hitesh Shah
Hello Peter For the history logging exception, would you mind filing a jira for this issue ( https://issues.apache.org/jira/browse/TEZ ) and attach the yarn applications log obtained by running the “yarn logs -applicationId” command? The history logging happens on a separate thread and its ex

Re: tez map task and reduce task stay pending forerver

2015-01-28 Thread Hitesh Shah
Hello Thanks for tracking down the issue to the vcores setting. Let me dig into that. Some initial questions: - do you know if YARN has been configured to schedule on both memory and vcores i.e. using the DominantResourceCalculator? - I am assuming that the max vcores per container is 1 but

Re: Questions about Tez under the hood

2015-01-28 Thread Hitesh Shah
l loop, it will match against any pending task. Sorting by next schedule time means the thread wakes up when it is time to run the next loop for a given container. Sorting by expiry would imply scanning the whole list to find all containers whose schedule time has elapsed. > >
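The ordering argument in this message can be illustrated with a small priority queue. This is a sketch under stated assumptions, not the Tez scheduler: the container ids and times are made up, and the helper names are hypothetical. It only shows that keeping containers ordered by next schedule time lets a thread find the earliest wake-up time and pop the due entries without scanning the whole collection.

```python
import heapq

# Hypothetical sketch, not the Tez scheduler: container ids and times are
# made-up values used only to show the ordering argument.


def build_heap(entries):
    """Build a min-heap of (next_schedule_time, container_id) tuples."""
    heap = list(entries)
    heapq.heapify(heap)
    return heap


def next_wakeup(heap):
    """Earliest next schedule time, or None if nothing is waiting."""
    return heap[0][0] if heap else None


def due_containers(heap, now):
    """Pop every container whose next schedule time has elapsed."""
    due = []
    while heap and heap[0][0] <= now:
        _, container_id = heapq.heappop(heap)
        due.append(container_id)
    return due
```

With this ordering the thread can sleep until `next_wakeup(heap)` and then call `due_containers(heap, now)`, touching only the entries that are actually due.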

Re: Failed to delete tez scratch data dir - kerberos secured

2015-01-30 Thread Hitesh Shah
Hi Johannes, Sorry - missed this email earlier. I have seen this a couple of times but have not been able to track down the root cause of this. Would you mind filing a jira for this with the logs from an AM where you observed this? Also, if you have a higher frequency of being able to reproduc

Re: operators in a Tez vertex

2015-02-05 Thread Hitesh Shah
Hello Robert, I believe currently the only way to do this is from the hive explain plan as most of this information is inaccessible to Tez ( it is passed in via the user payloads and only parsed by the Hive Processor ). I believe if you enable the ATSHooks in Hive, it publishes such informatio

Re: tez dot file location?

2015-02-07 Thread Hitesh Shah
To clarify, it will be in one of the log-dirs ( not the working dir ) of the AM’s container. If you can access the AM logs from the RM UI, you can try out using “http://people.apache.org/~gopalv/dagviz/”. When using “bin/yarn logs -applicationId ”, look for the files of container_…._01 wh

Re: reducers counters

2015-02-08 Thread Hitesh Shah
Hi Robert, Could you provide more details on the kind of Tez job you are running? For example, if I run orderedwordcount, I see the HDFS counters show up correctly: 2015-02-07 18:38:54,617 INFO [Dispatcher thread: Central] history.HistoryEventHandler: [HISTORY][DAG:dag_1423172314047_0008_1

Re: Unexpected containers allocated

2015-02-10 Thread Hitesh Shah
Just to be clear, if you are looking for AM debug logs, you can enable debug logging per app. Just use “word count -Dtez.am.log.level=DEBUG input output”. — Hitesh On Feb 10, 2015, at 4:02 AM, Fabio C. wrote: > Hi everyone, > I was running the tez wordcount example on a 6 nodes cluster. The i

Re: data locality

2015-02-23 Thread Hitesh Shah
Thanks for the feedback, Johannes. Would it be possible for you to file a jira for the performance issue that you are seeing with logs? Please strip out necessary data to hide the customer info, etc. The logs that would be most useful are: - comparison logs of an MR job vs a Tez job showing

Re: Parallel queries/dags running in same AM?

2015-03-09 Thread Hitesh Shah
A clarification for (2), you can share an AM across multiple users by using a form of proxy users and passing in the required delegation tokens to talk to various services such as HDFS. Also, HiveServer2, when the doAs mode is set to false, runs all AMs as user hive but can effectively run queries

Re: What is recommended memory setting for tez.am and tez task?

2015-03-09 Thread Hitesh Shah
Hello Alexander, Are you using Tez natively or via Hive/Pig/Cascading, etc? To a large extent, most users I have encountered tend to have tez.am.resource.memory.mb sized to be between 4-8 GB though in some cases, ( until TEZ-776 is addressed ), this might need to be increased for DAGs which have

Re: streamed splitting

2015-03-12 Thread Hitesh Shah
Hello Johannes, This is something we have discussed quite often but have not got around to implementing this. There might be an open jira related to “pipelining” of splits. If you cannot find it, please go ahead and create one. The general issues with these are: - how to handle dynamic crea

Re: error after installation - TezSession has already shutdown

2015-03-17 Thread Hitesh Shah
Hello First issue from the stack trace: "org.apache.tez.dag.api.TezUncheckedException: Invalid configuration of tez jars, tez.lib.uris is not defined in the configuration". It looks like your first run failed due to this. The second run seemed to be configured correctly but failed for a diffe

Re: error after installation - TezSession has already shutdown

2015-03-17 Thread Hitesh Shah
ter error was > raised... > > > regards > > > ---- > On Tue, 3/17/15, Hitesh Shah wrote: > > Subject: Re: error after installation - TezSession has already shutdown > To: user@tez.apache.org > Received: Tuesday

Re: protobuf version clarification

2015-03-22 Thread Hitesh Shah
Hi Michael, Yes - that is indeed an error regarding the version of protobuf that needs to be used. Pretty much all of the Hadoop ecosystem components have standardized on protobuf-2.5.0. Newer versions would likely have worked assuming protobuf retained compatibility. In any case, you do need

Re: BufferTooSmallException

2015-03-23 Thread Hitesh Shah
Hi Chris, I don’t believe this issue has been seen before. Could you file a jira for this with the full application logs ( obtained via bin/yarn logs -application ) and the configuration used? thanks — Hitesh On Mar 23, 2015, at 1:01 PM, Chris K Wensel wrote: > Hey all > > We have a user r

Re: Build tez UI failed.

2015-03-24 Thread Hitesh Shah
Hi I just downloaded the src-tar for the 0.6.0 release and built it without any issues. I have not seen this myself, but others have faced issues ( based on general searches[1] ) where there were transient problems downloading node locally. [1] https://github.com/eirslett

Re: Build tez UI failed.

2015-03-24 Thread Hitesh Shah
timeline server and tomcat. > > and I also changed script/config.js to the correct timeline URL and RMUrl (my > timeline webapp port is 50040, RM webapp port is 50030) > > then I open http://x.x.x.x:8080/tez-ui-xx/index.html, which is a blank page. > > I am strictly refer to

Re: Tez blog on Hortonworks is outdate

2015-03-26 Thread Hitesh Shah
Thanks for the heads up, Azuryy. I will pass the information along to the relevant folks at Hortonworks. And yes, you are right. At some point, we merged the code into the common TezClient so that it became easier for the user to toggle session-mode on/off via configuration and not have to write

Re: Tez configuration parameters

2015-04-08 Thread Hitesh Shah
Not much of help but you can start with these: http://tez.apache.org/releases/0.6.0/tez-api-javadocs/org/apache/tez/dag/api/TezConfiguration.html http://tez.apache.org/releases/0.6.0/tez-runtime-library-javadocs/org/apache/tez/runtime/library/api/TezRuntimeConfiguration.html The above 2 files’ d

Re: container fails to start with malloc error

2015-04-15 Thread Hitesh Shah
Hi Johannes Not sure if anyone has seen this earlier. Do you know if the machines have enough memory to run the no. of tasks/containers that you are launching? Also, I am assuming that you are compiling and running against the same jdk version? Would you mind sharing the details on what java v

Re: Can WebHCat show Tez jobs?

2015-04-17 Thread Hitesh Shah
Hello Xiaoyong, I believe you might get better help on the hive user mailing lists given that templeton is part of hive. I am guessing that Templeton has some hooks to recognize indirectly launched MR jobs but may not have built out the same support to recognize tez jobs. thanks — Hitesh O

Re: How to Tuning Tez Task Performance

2015-04-24 Thread Hitesh Shah
@r7raul1984, would you mind filing a documentation jira for your question? The list that Rajesh provided might be good to formalize into a doc and/or wiki. Also, please take a look at https://issues.apache.org/jira/browse/TEZ-2294 to see the full list of parameters. If you see something off or n

Re: Tez taskcount log visualization

2015-04-27 Thread Hitesh Shah
As a general point for other users, although this is not fully supported, you can try running 0.5.3 with the ATS history logging enabled and then setup the UI using the 0.6 build with the UI configured to point to the ATS server. Almost all of the required data published to ATS should be in the

Re: Tez taskcount log visualization

2015-04-28 Thread Hitesh Shah
.0 and Tez is 0.53. > I try http://192.168.117.117:8288/ws/v1/timeline/TEZ_DAG_ID?limit=11 will > retrun result ; > But Tez UI always display . > <54AA3VC(8~{WI`O`(04-29-08-47-14).png> > > r7raul1...@163.com > > From: Hitesh Shah > Date: 2015-04-27 23:19 >

Re: hive sql on tez run forever

2015-05-05 Thread Hitesh Shah
This might be a mail that is better suited for the user@hive mailing list to start with. thanks — Hitesh On May 5, 2015, at 12:58 AM, r7raul1...@163.com wrote: > I change the sql where condition to (where t.update_time >= '2015-05-04') , > the sql can return result for a while. Because t.up

Re: hive sql on tez run forever

2015-05-11 Thread Hitesh Shah
t time. and at that time shuffle keep running for ever. > which version of tez r u using? > > > On 2015-05-06 06:02 , Hitesh Shah Wrote: > > This might be a mail that is better suited for the user@hive mailing list to > start with. > > thanks > — Hitesh >

[ANNOUNCE] New Apache Tez Committer - Sreenath Somarajapuram

2015-05-13 Thread Hitesh Shah
Hi all, I am very pleased to announce that Sreenath Somarajapuram has been voted in as a committer in the Apache Tez project. We appreciate all the work Sreenath has put into the project so far ( especially for the Tez UI ), and are looking forward to his future contributions. Welcome aboard

Re: is there a way to map Tez UI graphs back to the script?

2015-05-15 Thread Hitesh Shah
This will require a change in Hive/Pig and a modification to the Tez UI. For every vertex, there is a ProcessorDescriptor. This supports a setHistoryText API. /** * Provide a human-readable version of the user payload that can be * used in the TEZ UI * @param historyText History text * For

Re: Enable local fetch optimization by default

2015-05-15 Thread Hitesh Shah
Hello Amit, For the most part, all that local fetch does is that in the case where the upstream vertex's output is on the same host where the downstream vertex task is running, the fetcher reads the data directly from disk instead of going via the http-based shuffle handler. This is an optimiz
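The decision described in this message can be sketched as a single comparison. This is only an illustration: the function name, the host strings, and the `local_fetch_enabled` flag are hypothetical and do not correspond to Tez classes or configuration keys.

```python
# Hypothetical sketch of the decision described in the message. The names
# here are invented for illustration and are not Tez APIs or settings.


def choose_fetch_path(output_host, task_host, local_fetch_enabled=True):
    """Pick how a downstream task would read one upstream output.

    Returns "local-disk" when the optimization is on and the upstream
    output sits on the host where the downstream task runs; otherwise
    returns "http-shuffle" (going through the shuffle handler).
    """
    if local_fetch_enabled and output_host == task_host:
        return "local-disk"
    return "http-shuffle"


if __name__ == "__main__":
    print(choose_fetch_path("node-1", "node-1"))
    print(choose_fetch_path("node-1", "node-2"))
```

The sketch makes the scope of the optimization visible: it only changes the read path for outputs that are already on the same host as the consuming task.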

Re: [DISCUSS] Drop Java 6 support in 0.8

2015-05-15 Thread Hitesh Shah
+1 on dropping 1.6 support from 0.8.0 onwards. @Mohammad, Hadoop is dropping support for 1.6 from 2.7.0 onwards. I am guessing other ecosystem projects will soon follow suit. — Hitesh On May 15, 2015, at 2:16 PM, Mohammad Islam wrote: > Hi Sid, > What are the statuses of other Hadoop project

Re: is there a way to map Tez UI graphs back to the script?

2015-05-16 Thread Hitesh Shah
. thanks — Hitesh On May 16, 2015, at 2:15 AM, Xiaoyong Zhu wrote: > I see, thanks Hitesh! > > That being said - currently Hive does not use ProcessorDescriptor to log the > query info, right? > > Xiaoyong > > -Original Message- > From: Hitesh Shah [mailto

Re: [DISCUSS] Drop Java 6 support in 0.8

2015-05-16 Thread Hitesh Shah
: > +1 non-binding on dropping 1.6. However given that we compile against > hadoop2.4 and hadoop2.2 which has requireJavaVersion 1.6 will it be an issue? > > > > > On 5/16/15, 3:54 AM, "Hitesh Shah" wrote: > >> +1 on dropping 1.6 support from 0.8

Re: Ten processor with multiple inputs

2015-05-18 Thread Hitesh Shah
There is nothing that prevents a processor running and finishing without even reading any data from any input. The only point when the processor blocks is when it tries to read data from a particular input that has not yet finished fetching all of its data. That said, a processor cannot yet que
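The blocking behaviour described in this message can be mimicked with a small stand-in. This is a sketch under assumptions: `SketchInput` and `processor` are hypothetical names and do not mirror the Tez Input or Processor interfaces. It shows that a read waits only on the specific input being read, and that an input that is never read never blocks the processor.

```python
import threading


class SketchInput:
    """Hypothetical stand-in for an input; not the Tez Input API."""

    def __init__(self, name):
        self.name = name
        self._ready = threading.Event()
        self._data = None

    def finish_fetch(self, data):
        """Mark the fetch as complete and make the data readable."""
        self._data = data
        self._ready.set()

    def read(self, timeout=None):
        """Block until the fetch has finished, or raise on timeout."""
        if not self._ready.wait(timeout):
            raise TimeoutError(f"{self.name} has not finished fetching")
        return self._data


def processor(inputs):
    """Read only the first input; the others are never waited on."""
    return inputs[0].read(timeout=1.0)
```

Here `processor` returns as soon as its first input is ready, even if the remaining inputs have not finished fetching.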

Re: Ten processor with multiple inputs

2015-05-18 Thread Hitesh Shah
This is with respect to how work is assigned to a Task. For a shuffle edge, a Task’s input is determined based on the partitions and how partitions are assigned to a Task. For a vertex reading data from HDFS ( initial input ), this is effectively random as the input data is split up and then ass

Re: [DISCUSS] Drop Java 6 support in 0.8

2015-05-18 Thread Hitesh Shah
ns don't matter, what matters > are downstream sourceVersions of tools which use Tez. > > Cheers, > Gopal > > On 5/16/15, 12:07 PM, "Hitesh Shah" wrote: > >> Excellent point @Prakash. I would probably change my vote to a -0 on that >> point itself. >>

Re: Tez local mode hanging in big testsuite

2015-05-21 Thread Hitesh Shah
Hello Andre, Could you file a JIRA for this and upload the logs around the point where it hangs? thanks — Hitesh On May 21, 2015, at 7:55 AM, Andre Kelpe wrote: > Hi, > > we have a big test suite for lingual, our SQL layer for cascading. We are > trying very hard to make it work correctly

Re: Tez log location?

2015-05-21 Thread Hitesh Shah
There is some history logging done that can be enabled via the SimpleHistoryLogger. This activates by default if ATS logger is not enabled. This is not fully compatible with the ATS data and also as it is mostly experimental, it may not have all the data. To use it, you can configure the “tez.h

Re: EOFException - TezJob - Cannot submit DAG

2015-05-22 Thread Hitesh Shah
Hello Patcharee Could you start with sending a mail to users@pig to see if they have come across this issue first? Also, can you check the application master logs to see if there are any errors ( might be useful to enable DEBUG level logging to get more information )? thanks — Hitesh On May

Re: tez timeline domain configuration?

2015-05-26 Thread Hitesh Shah
In hadoop 2.4, YARN timeline does not really support proper security. It does not have any ACL support ( implemented using something called domains ). The property that you mentioned needs to be set as it is a form of a warning to the user that you are running Tez with YARN Timeline with ACLs e

Re: How to find query log when I use hiveserver2 on tez?

2015-05-27 Thread Hitesh Shah
These are hive-specific questions. Could you please send this mail to the user@hive list instead? thanks — Hitesh On May 26, 2015, at 10:28 PM, r7raul1...@163.com wrote: > How to find query log when I use hiveserver2 on tez? My enviroment is hive > 1.1.0+tez0.53 use beeline to connect hiveser

Re: Tez lauche container error when use UseG1GC

2015-05-29 Thread Hitesh Shah
To clarify, given that the error is showing up with container_1432885077153_0004_01_05, that means that the AM launched properly. Use “bin/yarn logs -applicationId application_1432885077153_0004” to get the logs. See if there are any errors in the logs for container_1432885077153_0004_01

[ANNOUNCE] New Apache Tez PMC Member - Jeff Zhang

2015-05-31 Thread Hitesh Shah
Hi all, I am very pleased to announce that Jeff Zhang has been voted in as a member of the Apache Tez PMC. We appreciate all the work Jeff has put into the project so far, and are looking forward to his future (greater) contributions. Please join me in welcoming Jeff to the Tez PMC. Congratul

Re: hive 1.1.0 tez0.7 hadoop 2.5.0 run query NoClassDefFoundError

2015-06-03 Thread Hitesh Shah
If you compiled Tez against hadoop-2.6.0 and are deploying it on a hadoop-2.5.0 cluster, you should disable tez acls as YARN timeline in 2.5.0 does not support ACLs. Please set tez.am.acls.enabled to false as the Timeline layer is trying to enforce acls for the history data. thanks — Hitesh O

Re: hive 1.1.0+tez0.7+hadoop2.5.0-cdh5.2.0 when use TEZ UI throw exception

2015-06-04 Thread Hitesh Shah
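There might be a better way to approach this. You can add the following to the top-level pom.xml in the repositories section: <repository> <id>cloudera-repo</id> <name>Cloudera Repository</name> <url>https://repository.cloudera.com/artifactory/cloudera-repos</url> <releases> <enabled>true</enabled> </releases> <snapshots> <enabled>false</enabled> </snapshots> </repository>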

Re: Tez ATS integration

2015-06-15 Thread Hitesh Shah
Templeton has had some bugs related to not adding tez-site.xml to the launcher task from which the actual hive job runs. I think they are fixed in the newer versions. Can you try setting the tez ats property into webhcat-site or hive-site and see if it propagates all the way to the MR task which

Re: error on hive insert query

2015-06-16 Thread Hitesh Shah
Unless you can pinpoint the problem to something Tez specific, hive-specific questions might be better off being asked on user@hive initially as there is a larger group there that understands Hive as compared to the Tez community. FWIW, TezTask error 1 means “something in the Hive layer using T

Re: 2nd port on Application Master

2015-06-17 Thread Hitesh Shah
This may likely be TaskAttemptListenerImpTezDag used for communication between the AM and the task containers. There are no configuration knobs for it currently as it is always internal to a cluster. If you don’t mind, can you file a jira and contribute a patch to add a clarifying log message t

Re: hive 1.1.0 on tez0.53 error

2015-06-17 Thread Hitesh Shah
That particular log is a red herring and not really an issue that is causing the failure. The main problem based on the log is this: 2015-06-17 18:00:43,543 INFO [AsyncDispatcher event handler] history.HistoryEventHandler: [HISTORY][DAG:dag_1433219182593_180456_3][Event:DAG_FINISHED]: dagId

Re: hive 1.1.0 on tez0.53 error

2015-06-17 Thread Hitesh Shah
nostics=[Vertex received Kill while in RUNNING state., Vertex killed as > other vertex failed. failedTasks:0, Vertex vertex_1433219182593_180456_3_00 > [Map 1] killed/failed due to:null] > DAG failed due to vertex failure. failedVertices:1 killedVertices:1 > FAILED: Execution Error

Re: hive 1.1.0 on tez0.53 error

2015-06-18 Thread Hitesh Shah
null] > DAG failed due to vertex failure. failedVertices:1 killedVertices:1 > FAILED: Execution Error, return code 2 from > org.apache.hadoop.hive.ql.exec.tez.TezTask > > I think maybe first successfule job delete tez-conf.pb ? > > r7raul1...@163.com > > From:

Re: Error while building Tez UI

2015-06-19 Thread Hitesh Shah
The build tries to download node and then install it. Did the download fail for some reason? You can try running in debug mode i.e. “mvn -X” to see why it could be failing. — Hitesh On Jun 19, 2015, at 8:17 AM, amit kumar wrote: > Hi, > > > I am trying to build tez-0.7.0, but can not build

Re: hive 1.1.0 on tez0.53 error

2015-06-19 Thread Hitesh Shah
Yes something along those lines. This might help a bit more: http://techtonka.com/?p=174 thanks — Hitesh On Jun 18, 2015, at 6:44 PM, r7raul1...@163.com wrote: > > log like this hdfs-audit.log.9 ? > r7raul1...@163.com > > From: Hitesh Shah > Date: 2015-06-19 02:28 > T

Re: Error while building Tez UI

2015-06-23 Thread Hitesh Shah
o if there is any location from where I can download the jar > (artifact) directly instead of building it. > > It would be very helpful. > > Thanks. > > > On Fri, Jun 19, 2015 at 3:31 PM, Hitesh Shah wrote: > The build tries to download node and then install it.

Re: Error while running Hive queries over tez

2015-06-24 Thread Hitesh Shah
The error seems to indicate that the create dir/write to HDFS failed. Can you check what user you are running as and whether the user has permissions to create/write to the directories in the path below? Furthermore, you may wish to check if the datanodes are alive and also look for erro

Re: Error while running Hive queries over tez

2015-06-24 Thread Hitesh Shah
A couple of things to do: 1) (optional) For tez.lib.uris, set it to "${fs.defaultFS}/apps/tez/tez.tar.gz" - this tez.tar.gz should come from tez-dist/target/. Given that your basic job is working, you can ignore this for now but it is the recommended way to deploy tez. 2) On a fresh setup, do

[ANNOUNCE] Apache Tez 0.5.4 release

2015-06-29 Thread Hitesh Shah
Hello folks, The Apache Tez team is proud to announce the release of Apache Tez version 0.5.4 The Apache Tez project is aimed at creating a framework to build efficient and scalable data processing applications that can be modeled as data flow graphs. This release is a maintenance release fo

Re: fails to alter table concatenate

2015-06-30 Thread Hitesh Shah
Move to user@hive. BCC’ed user@tez. — Hitesh On Jun 30, 2015, at 1:44 AM, patcharee wrote: > Hi, > > I am using hive 0.14 + tez 0.5. It fails to alter table concatenate > occasionally (see the exception below). It is strange that it fails from time > to time not predictable. However, it wor

Re: Hive Tez support matrix

2015-07-07 Thread Hitesh Shah
From a Tez perspective, there was a major compatibility change between Tez 0.4 and Tez 0.5. However, Tez-0.7.x and Tez-0.6.x are compatible with Tez-0.5.x. I believe Hive 0.13 is compatible only with Tez 0.4. Hive 0.14 onwards ( including the Hive-1.x releases ) should work with any

Re: Tez Counter question

2015-07-08 Thread Hitesh Shah
For data skew, you may also want to consider enabling “tez.task.generate.counters.per.io”. This enables counters on a per edge basis which is more helpful for complex DAGs. — Hitesh On Jul 8, 2015, at 10:29 PM, Joe Zhang (SDE) wrote: > Hi Rajesh: > > Thanks for your reply. I want to know mo

Re: Error compiling tez (0.6.1 and 0.7.0) from scratch

2015-07-15 Thread Hitesh Shah
Thanks for digging into the issue, Rajat. Mind filing a jira for this and uploading your patch ( without the frontend plugin change)? thanks — Hitesh On Jul 15, 2015, at 3:53 PM, Rajat Jain wrote: > I run this command: > > mvn clean install -DskipTests > > It automatically picks up Hadoo

Re: hive on tez strange issue

2015-07-15 Thread Hitesh Shah
I think this might be a bug somewhere in YARN ( specifically the fair scheduler layer ) as Tez has no control over the data being displayed. The resource usage and accounting is all handled within the ResourceManager in YARN. You should probably file a jira against YARN for this issue and attac

Re: Problem in building tez-0.7

2015-07-17 Thread Hitesh Shah
Hi Sachin, Quite a few folks have reported seeing this and there is an open jira for this. From a previous reply by Prakash: {quote} This is a known bug. TEZ-2560 is tracking it – will put in a fix soon. For now you can use a maven version < 3.3 OR change the frontend-maven-plugin version to
