Hello Paul,

Yes, I can access the AM web UI.

I noticed that the problem is caused by SSL/TLS access being enabled (ssl-enabled: true): https://xxxxxxxxx:10048/rest/status works fine in the browser, so I think I have to deal with certificates on the AM host.
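Here is what I plan to check from the DoY client machine (the host and port below just stand in for my masked values):

    # Does the AM REST endpoint answer at all if certificate validation is skipped?
    curl -k https://xxxxxxxxx:10048/rest/status

    # Which certificate is the AM actually presenting, and who signed it?
    openssl s_client -connect xxxxxxxxx:10048 -showcerts </dev/null

If the -k call succeeds but the plain call fails, I assume the fix is to make the client trust the AM's certificate (for example, importing the signing CA into the truststore the client JVM uses, with keytool -importcert). If even -k fails, the problem is probably elsewhere.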
Do you have an idea? Thanks.

On Mon, Jan 14, 2019 at 4:53 PM Paul Rogers <par0...@yahoo.com.invalid> wrote:

> Hi,
> Can you reach the AM web UI? The web UI URL was shown below. It also should have been given when you started DoY.
> I notice that you're using SSL/TLS access. Doing so requires the right certificates on the AM host. Again, trying to connect via your browser may help identify whether that works.
> If the web UI works, then check the host name and port number in your browser against those shown in the error message.
> The resize command on the command line does nothing other than some validation, then it sends the URL shown below. You can try entering the URL directly into your browser. Again, if that fails, there is something amiss with your config. If it works, then we'll have to figure out what might be wrong with the DoY command-line tool.
> Please try out the above and let us know what you learn.
> Thanks,
> - Paul
>
> On Monday, January 14, 2019, 7:30:44 AM PST, Kwizera hugues Teddy <nbted2...@gmail.com> wrote:
>
> Hello all,
> I am experiencing an error on resize and status. The errors are from the REST call on the AM.
> Command: $DRILL_HOME/bin/drill-on-yarn.sh --site $DRILL_SITE status
> Result:
> Application ID: xxxxxxxxxxxxxxxx
> Application State: RUNNING
> Host: xxxxxxxxxxxxxxxx
> Queue: root.xxxxx.default
> User: xxxxxxxx
> Start Time: 2019-01-14 14:56:29
> Application Name: Drill-on-YARN-cluster_01
> Tracking URL: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> Failed to get AM status
> REST request failed: https://xxxxxxxxxxxxxxx:9048/rest/status
> Command: $DRILL_HOME/bin/drill-on-yarn.sh --site $DRILL_SITE resize
> Result:
> Resizing cluster for Application ID: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> Resize failed: REST request failed: https://xxxxxxxxxxxxxxx:9048/rest/shrink/1
> I didn't find how to resolve this issue. Maybe someone can help me.
> Thanks.
>
> On Sat, Jan 12, 2019 at 8:30 AM Kwizera hugues Teddy <nbted2...@gmail.com> wrote:
>
> > Hello,
> > The other option works.
> > As you say, an update is needed in the docs and the wrong information should be removed.
> > Thanks.
> >
> > On Sat, Jan 12, 2019, 08:10 Abhishek Girish <agir...@apache.org> wrote:
> >
> >> Hello Teddy,
> >> I don't recollect a restart option for the drill-on-yarn.sh script. I've always used a combination of stop and start, like Paul mentions. Could you please try whether that works and get back to us? We could certainly have a minor enhancement to support restart - until then I'll request Bridget to update the documentation.
> >> Regards,
> >> Abhishek
> >>
> >> On Fri, Jan 11, 2019 at 11:05 PM Kwizera hugues Teddy <nbted2...@gmail.com> wrote:
> >>
> >> > Hello Paul,
> >> > Thank you for your response, with some interesting information (the files in /tmp).
> >> > On my side, all the other command-line options work normally (start|stop|status|...), but not restart (the option is not recognized). I searched the source code and found that the restart command is not implemented, so I wonder why the documentation does not match the source code.
> >> > Thanks. Teddy
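(For anyone hitting the same thing: the workaround Abhishek describes above is simply a stop followed by a start with the same site directory, i.e.:)

    $DRILL_HOME/bin/drill-on-yarn.sh --site $DRILL_SITE stop
    $DRILL_HOME/bin/drill-on-yarn.sh --site $DRILL_SITE start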
> >> > On Sat, Jan 12, 2019, 02:39 Paul Rogers <par0...@yahoo.com.invalid> wrote:
> >> >
> >> > > Let's try to troubleshoot. Does the combination of stop and start work? If so, then there could be a bug with the restart command itself.
> >> > > If neither start nor stop works, it could be that you are missing the application ID file created when you first started DoY. Some background.
> >> > > When we submit an app to YARN, YARN gives us an app ID. We need this in order to track down the app master for DoY so we can send it commands later.
> >> > > When the command-line tool starts DoY, it writes the YARN app ID to a file. I can't remember the details, but it is probably in the $DRILL_SITE directory. The contents are, as I recall, a long hexadecimal string.
> >> > > When you invoke the command line, the tool reads this file to track down the DoY app master. The tool then sends commands to the app master: in this case, a request to shut down. Then, for restart, the tool will communicate with YARN to start a new instance.
> >> > > The tool is supposed to give detailed error messages. Did you get any? That might tell us which of these steps failed.
> >> > > Can you connect to the DoY web UI at the URL provided when you started DoY? If you can, this means that the DoY App Master is up and running.
> >> > > Are you running the client from the same node on which you started it? That file I mentioned is local to the "DoY client" machine; it is not in DFS.
> >> > > Then, there is one more very obscure bug you can check. On some distributions, the YARN task files are written to the /tmp directory. Some Linux systems remove these files from time to time. Once the files are gone, YARN can no longer control its containers: it won't be able to stop the app master or the Drillbit containers. There are two fixes. First, go kill all the processes by hand. Then, move the YARN state files out of /tmp, or exclude YARN's files from the periodic cleanup.
> >> > > Try some of the above and let us know what you find.
> >> > > Also, perhaps Abhishek can offer some suggestions, as he tested the heck out of the feature and may have additional suggestions.
> >> > > Thanks,
> >> > > - Paul
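(A quick way to cross-check the app ID and AM state Paul describes, run from the DoY client node. The exact name of the app ID file under $DRILL_SITE may differ per install, so treat this as a sketch; the application ID below is a placeholder.)

    # Look for the app ID file DoY wrote into the site directory at start time
    ls $DRILL_SITE

    # Find the Drill-on-YARN application from YARN's point of view
    yarn application -list -appStates RUNNING

    # Show state, queue, and tracking URL for it
    yarn application -status application_1547000000000_0001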
> >> > > On Friday, January 11, 2019, 7:46:55 AM PST, Kwizera hugues Teddy <nbted2...@gmail.com> wrote:
> >> > >
> >> > > Hello,
> >> > > Two weeks ago I began to discover DoY. Today, while reading the Drill documentation (https://drill.apache.org/docs/appendix-a-release-note-issues/), I saw that we can restart the Drill cluster with:
> >> > > $DRILL_HOME/bin/drill-on-yarn.sh --site $DRILL_SITE restart
> >> > > But it doesn't work when I tested it.
> >> > > No idea about it?
> >> > > Thanks.
> >> > >
> >> > > On Wed, Jan 2, 2019 at 3:18 AM Paul Rogers <par0...@yahoo.com.invalid> wrote:
> >> > >
> >> > > > Hi Charles,
> >> > > > Your engineers have identified a common need, but one which is very difficult to satisfy.
> >> > > > TL;DR: DoY gets as close to the requirements as possible within the constraints of YARN and Drill. But future projects could do more.
> >> > > > Your engineers want resource segregation among tenants: multi-tenancy. This is very difficult to achieve at the application level. Consider Drill. It would need some way to identify users to know which tenant they belong to. Then, Drill would need a way to enqueue users whose queries would exceed the memory or CPU limit for that tenant. Plus, Drill would have to be able to limit memory and CPU for each query. Much work has been done to limit memory, but CPU is very difficult. Mature products such as Teradata can do this, but Teradata has 40 years of effort behind it.
> >> > > > Since it is hard to build multi-tenancy in at the app level (not impossible, just very, very hard), the thought is to apply it at the cluster level. This is done in YARN by limiting the resources available to processes (typically map/reduce) and by limiting the number of running processes. This works for M/R because each map task uses disk to shuffle results to a reduce task, so map and reduce tasks can run asynchronously.
> >> > > > For tools such as Drill, which do in-memory processing (really, across-the-network exchanges), both the sender and the receiver have to run concurrently. This is much harder to schedule than async M/R tasks: it means that the entire Drill cluster (of whatever size) must be up and running to run a query.
> >> > > > The start-up time for Drill is far, far longer than a query. So it is not feasible to use YARN to launch a Drill cluster for each query the way you would do with Spark. Instead, under YARN, Drill is a long-running service that handles many queries.
> >> > > > Obviously, this is not ideal: I'm sure your engineers want to use a tenant's resources for Drill when running queries, else for Spark, Hive, or maybe TensorFlow. If Drill has to be long-running, I'm sure they'd like to slosh resources between tenants as is done in YARN. As noted above, this is a hard problem that DoY did not attempt to solve.
> >> > > > One might suggest that Drill grab resources from YARN when tenant A wants to run a query, and release them when that tenant is done, grabbing new resources when tenant B wants to run. Impala tried this with Llama and found it did not work. (This is why DoY is quite a bit simpler; there is no reason to rerun a failed experiment.)
> >> > > > Some folks are looking to Kubernetes (K8s) as a solution. But that just replaces YARN with K8s: Drill is still a long-running process.
> >> > > > To solve the problem you identify, you'll need either:
> >> > > > * a bunch of work in Drill to build multi-tenancy into Drill, or
> >> > > > * a cloud-like solution in which each tenant spins up a Drill cluster within its budget, spinning it down, or resizing it, to stay within an overall budget.
> >> > > > The second option can be achieved under YARN with DoY, assuming that DoY added support for graceful shutdown (or the cluster is reduced in size only when no queries are active). Longer term, a more modern solution would be Drill-on-Kubernetes (DoK?), which Abhishek started on.
> >> > > > Engineering is the art of compromise. The question for your engineers is how to achieve the best result given the limitations of the software available today, while at the same time helping the Drill community improve the solutions over time.
> >> > > > Thanks,
> >> > > > - Paul
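(A sketch of what that second option can look like in practice. The site paths and tenant names below are made up; the assumption is that each tenant gets its own site directory whose drill-on-yarn.conf points at that tenant's YARN queue and its own ZooKeeper cluster-id and ports, along the lines of the setup described further down in this thread.)

    # Start tenant A's Drill cluster, submitted to tenant A's YARN queue
    $DRILL_HOME/bin/drill-on-yarn.sh --site /etc/drill/sites/tenant-a start

    # Start tenant B's cluster, isolated by its own site directory and queue
    $DRILL_HOME/bin/drill-on-yarn.sh --site /etc/drill/sites/tenant-b start

    # Grow or shrink a tenant's cluster to stay within its budget
    $DRILL_HOME/bin/drill-on-yarn.sh --site /etc/drill/sites/tenant-a resize 3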
> >> > > > On Sunday, December 30, 2018, 9:38:04 PM PST, Charles Givre <cgi...@gmail.com> wrote:
> >> > > >
> >> > > > Hi Paul,
> >> > > > Here's what our engineers said:
> >> > > > From Paul's response, I understand that there is a slight confusion around how multi-tenancy has been enabled in our data lake.
> >> > > > Some more details on this:
> >> > > > Drill already has the concept of multi-tenancy, where we can have multiple Drill clusters running on the same data lake, enabled through different ports and ZooKeeper. But all of this is launched through the same hard-coded YARN queue that we provide as a config parameter.
> >> > > > In our data lake, each tenant has a certain amount of compute capacity allotted to them which they can use for their project work. This is provisioned through individual YARN queues for each tenant (resource caging). It restricts tenants from using cluster resources beyond a certain limit, so they do not impact other tenants.
> >> > > > Access to these YARN queues is provisioned through ACL memberships.
> >> > > > ——
> >> > > > Does this make sense? Is it possible to get Drill to work in this manner, or should we look into opening up JIRAs and working on new capabilities?
> >> > > >
> >> > > > > On Dec 17, 2018, at 21:59, Paul Rogers <par0...@yahoo.com.INVALID> wrote:
> >> > > > >
> >> > > > > Hi Kwizera,
> >> > > > > I hope my answer to Charles gave you the information you need. If not, please check out the DoY documentation or ask follow-up questions.
> >> > > > > Key thing to remember: Drill is a long-running YARN service; queries DO NOT go through YARN queues, they go through Drill directly.
> >> > > > > Thanks,
> >> > > > > - Paul
> >> > > > >
> >> > > > > On Monday, December 17, 2018, 11:01:04 AM PST, Kwizera hugues Teddy <nbted2...@gmail.com> wrote:
> >> > > > >
> >> > > > > Hello,
> >> > > > > Same question here: I would like to know how Drill deals with this YARN functionality.
> >> > > > > Cheers.
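(One way to see Paul's point in practice, assuming default ports and with drillbit-host as a placeholder: YARN only ever sees the one long-running Drill-on-YARN application sitting in its queue, while individual queries are visible only through Drill's own web UI.)

    # From YARN's side there is just the long-running DoY application in the tenant's queue
    yarn application -list -appStates RUNNING | grep -i drill

    # Individual queries never appear in YARN; they show up in Drill's own profiles page
    # (8047 is Drill's default web port; use https if the Drillbit web UI has SSL enabled)
    curl http://drillbit-host:8047/profiles.json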
> >> > > > > On Mon, Dec 17, 2018, 17:53 Charles Givre <cgi...@gmail.com> wrote:
> >> > > > >
> >> > > > >> Hello all,
> >> > > > >> We are trying to set up a Drill cluster on our corporate data lake. Our cluster requires dynamic YARN queue allocation for a multi-tenant environment. Is this something that Drill supports, or is there a workaround?
> >> > > > >> Thanks!
> >> > > > >> —C