Maxim,

This is very interesting. Would you be interested in writing an Accumulo blog post about your experience? If you are interested I can help.
Keith

On Tue, Jan 15, 2019 at 10:03 AM Maxim Kolchin <[email protected]> wrote:
>
> Hi,
>
> I just wanted to leave intermediate feedback on the topic.
>
> So far, Accumulo works pretty well on top of Google Storage. The
> aforementioned issue still exists, but it doesn't break anything. However, I
> can't give you any useful performance numbers at the moment.
>
> The cluster:
>
> - master (with zookeeper) (n1-standard-1) + 2 tservers (n1-standard-4)
> - 32+ billion entries
> - 5 tables (excluding system tables)
>
> Some averaged numbers from two use cases:
>
> - batch write into pre-split tables with 40 client machines + 4 tservers
> (n1-standard-4) - max speed 1.5M entries/sec.
> - sequential read with 2 client iterators (1 - filters by key, 2 - filters by
> timestamp), with 5 client machines + 2 tservers (n1-standard-4) and less
> than 60k entries returned - max speed 1M+ entries/sec.
>
> Maxim
>
> On Mon, Jun 25, 2018 at 12:57 AM Christopher <[email protected]> wrote:
>>
>> Ah, ok. One of the comments on the issue led me to believe that it was the
>> same issue as the missing custom log closer.
>>
>> On Sat, Jun 23, 2018, 01:10 Stephen Meyles <[email protected]> wrote:
>>>
>>> > I'm not convinced this is a write pattern issue, though. I commented on..
>>>
>>> The note there suggests the need for a LogCloser implementation; in my
>>> (ADLS) case I've written one and have it configured - the exception I'm
>>> seeing involves failures during writes, not during recovery (though it then
>>> leads to a need for recovery).
>>>
>>> S.
>>>
>>> On Fri, Jun 22, 2018 at 4:33 PM, Christopher <[email protected]> wrote:
>>>>
>>>> Unfortunately, that feature wasn't added until 2.0, which hasn't yet been
>>>> released, but I'm hoping it will be later this year.
>>>>
>>>> However, I'm not convinced this is a write pattern issue. I commented on
>>>> https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103#issuecomment-399608543
>>>>
>>>> On Fri, Jun 22, 2018 at 1:50 PM Stephen Meyles <[email protected]> wrote:
>>>>>
>>>>> Knowing that HBase has been run successfully on ADLS, I went looking there
>>>>> (as they have the same WAL write pattern). This is informative:
>>>>>
>>>>> https://www.cloudera.com/documentation/enterprise/5-12-x/topics/admin_using_adls_storage_with_hbase.html
>>>>>
>>>>> which suggests a need to split the WALs off on HDFS proper versus ADLS
>>>>> (or presumably GCS) barring changes in the underlying semantics of each.
>>>>> AFAICT you can't currently configure Accumulo to send WAL logs to a
>>>>> separate cluster - is this correct?
>>>>>
>>>>> S.
>>>>>
>>>>> On Fri, Jun 22, 2018 at 9:07 AM, Stephen Meyles <[email protected]> wrote:
>>>>>>
>>>>>> > Did you try to adjust any Accumulo properties to do bigger writes less
>>>>>> > frequently or something like that?
>>>>>>
>>>>>> We're using BatchWriters and sending reasonably large batches of
>>>>>> Mutations. Given the stack traces in both our cases are related to WAL
>>>>>> writes, it seems like batch size would be the only tweak available here
>>>>>> (though, without reading the code carefully, it's not even clear to me
>>>>>> that it is impactful), but if others have suggestions I'd be happy to
>>>>>> try them.
>>>>>>
>>>>>> Given we have this working well and stable in other clusters atop
>>>>>> traditional HDFS, I'm currently pursuing this further with the MS to
>>>>>> understand the variance to ADLS. Depending on what emerges from that, I
>>>>>> may circle back with more details and a bug report and start digging
>>>>>> more deeply into the relevant code in Accumulo.
>>>>>>
>>>>>> S.
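Regarding Stephen's question about sending WALs to a separate filesystem: once 2.0's per-scope volume choosers are available, the split he describes might look roughly like the sketch below. The namenode host, bucket name, and the exact chooser class and property names here are assumptions from memory and should be verified against the 2.0 documentation before use:

```properties
# accumulo.properties (2.0) - hedged sketch, not a tested configuration.

# Register both an HDFS volume and a GCS volume.
instance.volumes=hdfs://namenode.example.com:8020/accumulo,gs://example-bucket/accumulo

# Use the preferred-volume chooser so different scopes can prefer different volumes.
general.volume.chooser=org.apache.accumulo.server.fs.PreferredVolumeChooser

# Table files (RFiles) can live on the object store.
general.custom.volume.preferred.default=gs://example-bucket/accumulo

# The "logger" scope covers write-ahead logs: keep them on HDFS, which
# supports the append/flush semantics WALs depend on.
general.custom.volume.preferred.logger=hdfs://namenode.example.com:8020/accumulo
```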
>>>>>>
>>>>>>
>>>>>> On Fri, Jun 22, 2018 at 6:09 AM, Maxim Kolchin <[email protected]>
>>>>>> wrote:
>>>>>>>
>>>>>>> > If somebody is interested in using Accumulo on GCS, I'd like to
>>>>>>> > encourage them to submit any bugs they encounter, and any patches (if
>>>>>>> > they are able) which resolve those bugs.
>>>>>>>
>>>>>>> I'd like to contribute a fix, but I don't know where to start. We tried
>>>>>>> to get help from Google Support about [1] over email, but they just say
>>>>>>> that GCS doesn't support such a write pattern. In the end, we can only
>>>>>>> guess how to adjust the Accumulo behaviour to minimise broken
>>>>>>> connections to GCS.
>>>>>>>
>>>>>>> BTW, although we observe this exception, the tablet server doesn't
>>>>>>> fail, which means that after some retries it is able to write WALs to
>>>>>>> GCS.
>>>>>>>
>>>>>>> @Stephen,
>>>>>>>
>>>>>>> > as discussions with MS engineers have suggested, similar to the GCS
>>>>>>> > thread, that small writes at high volume are, at best, suboptimal for
>>>>>>> > ADLS.
>>>>>>>
>>>>>>> Did you try to adjust any Accumulo properties to do bigger writes less
>>>>>>> frequently or something like that?
>>>>>>>
>>>>>>> [1]: https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103
>>>>>>>
>>>>>>> Maxim
>>>>>>>
>>>>>>> On Thu, Jun 21, 2018 at 7:17 AM Stephen Meyles <[email protected]>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> I think we're seeing something similar, but in our case we're trying
>>>>>>>> to run Accumulo atop ADLS. When we generate sufficient write load we
>>>>>>>> start to see stack traces like the following:
>>>>>>>>
>>>>>>>> [log.DfsLogger] ERROR: Failed to write log entries
>>>>>>>> java.io.IOException: attempting to write to a closed stream;
>>>>>>>> at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:88)
>>>>>>>> at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:77)
>>>>>>>> at org.apache.hadoop.fs.adl.AdlFsOutputStream.write(AdlFsOutputStream.java:57)
>>>>>>>> at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:48)
>>>>>>>> at java.io.DataOutputStream.write(DataOutputStream.java:88)
>>>>>>>> at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
>>>>>>>> at org.apache.accumulo.tserver.logger.LogFileKey.write(LogFileKey.java:87)
>>>>>>>> at org.apache.accumulo.tserver.log.DfsLogger.write(DfsLogger.java:537)
>>>>>>>>
>>>>>>>> We have developed a rudimentary LogCloser implementation that allows
>>>>>>>> us to recover from this, but overall performance is significantly
>>>>>>>> impacted.
>>>>>>>>
>>>>>>>> > As for the WAL closing issue on GCS, I recall a previous thread
>>>>>>>> > about that
>>>>>>>>
>>>>>>>> I searched more for this but wasn't able to find anything, nor
>>>>>>>> anything similar re: ADL. I am also curious about the earlier
>>>>>>>> question:
>>>>>>>>
>>>>>>>> >> Does Accumulo have a specific write pattern [to WALs], so that file
>>>>>>>> >> system may not support it?
>>>>>>>>
>>>>>>>> as discussions with MS engineers have suggested, similar to the GCS
>>>>>>>> thread, that small writes at high volume are, at best, suboptimal for
>>>>>>>> ADLS.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>> Stephen
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jun 20, 2018 at 11:20 AM, Christopher <[email protected]>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> For what it's worth, this is an Apache project, not a Sqrrl project.
>>>>>>>>> Amazon is free to contribute to Accumulo to improve its support of
>>>>>>>>> their platform, just as anybody is free to do. Amazon may start
>>>>>>>>> contributing more as a result of their acquisition... or they may
>>>>>>>>> not. There is no reason to expect that their acquisition will have
>>>>>>>>> any impact whatsoever on the platforms Accumulo supports, because
>>>>>>>>> Accumulo is not, and has never been, a Sqrrl project (although some
>>>>>>>>> Sqrrl employees have contributed), and thus will not become an Amazon
>>>>>>>>> project. It has been, and will remain, a vendor-neutral Apache
>>>>>>>>> project. Regardless, we welcome contributions from anybody which
>>>>>>>>> would improve Accumulo's support of any additional platform
>>>>>>>>> alternatives to HDFS, whether it be GCS, S3, or something else.
>>>>>>>>>
>>>>>>>>> As for the WAL closing issue on GCS, I recall a previous thread about
>>>>>>>>> that... I think a simple patch might be possible to solve that issue,
>>>>>>>>> but to date, nobody has contributed a fix. If somebody is interested
>>>>>>>>> in using Accumulo on GCS, I'd like to encourage them to submit any
>>>>>>>>> bugs they encounter, and any patches (if they are able) which resolve
>>>>>>>>> those bugs. If they need help submitting a fix, please ask on the
>>>>>>>>> dev@ list.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jun 20, 2018 at 8:21 AM Geoffry Roberts
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Maxim,
>>>>>>>>>>
>>>>>>>>>> Interesting that you were able to run Accumulo on GCS. I never
>>>>>>>>>> thought of that--good to know.
>>>>>>>>>>
>>>>>>>>>> Since I am now an AWS guy (at least for the time being), in light of
>>>>>>>>>> the fact that Amazon purchased Sqrrl, I am interested to see what
>>>>>>>>>> develops.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Jun 20, 2018 at 5:15 AM, Maxim Kolchin
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Geoffry,
>>>>>>>>>>>
>>>>>>>>>>> Thank you for the feedback!
>>>>>>>>>>>
>>>>>>>>>>> Thanks to [1, 2], I was able to run an Accumulo cluster on Google
>>>>>>>>>>> VMs with GCS instead of HDFS, and I used Google Dataproc to run
>>>>>>>>>>> Hadoop jobs on Accumulo. Almost everything was fine until I faced
>>>>>>>>>>> some connection issues with GCS. Quite often, the connection to GCS
>>>>>>>>>>> breaks on writing or closing WALs.
>>>>>>>>>>>
>>>>>>>>>>> To all,
>>>>>>>>>>>
>>>>>>>>>>> Does Accumulo have a specific write pattern, such that a file
>>>>>>>>>>> system may not support it? Are there Accumulo properties which I
>>>>>>>>>>> can play with to adjust the write pattern?
>>>>>>>>>>>
>>>>>>>>>>> [1]: https://github.com/cybermaggedon/accumulo-gs
>>>>>>>>>>> [2]: https://github.com/cybermaggedon/accumulo-docker
>>>>>>>>>>>
>>>>>>>>>>> Thank you!
>>>>>>>>>>> Maxim
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jun 19, 2018 at 10:31 PM Geoffry Roberts
>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> I tried running Accumulo on Google. I first tried running it on
>>>>>>>>>>>> Google's pre-made Hadoop. I found the various file paths one must
>>>>>>>>>>>> contend with are different on Google than on a straight download
>>>>>>>>>>>> from Apache. It seems they moved things around. To counter this, I
>>>>>>>>>>>> installed my own Hadoop along with Zookeeper and Accumulo on a
>>>>>>>>>>>> Google node. All went well until one fine day when I could no
>>>>>>>>>>>> longer log in. It seems Google had pushed out some changes
>>>>>>>>>>>> overnight that broke my client-side Google Cloud installation.
>>>>>>>>>>>> Google referred the affected to a lengthy, easy-to-make-a-mistake
>>>>>>>>>>>> procedure for resolving the issue.
>>>>>>>>>>>>
>>>>>>>>>>>> I decided life was too short for this kind of thing and switched
>>>>>>>>>>>> to Amazon.
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Jun 19, 2018 at 7:34 AM, Maxim Kolchin
>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Does anyone have experience running Accumulo on top of Google
>>>>>>>>>>>>> Cloud Storage instead of HDFS? In [1] you can see some details if
>>>>>>>>>>>>> you have never heard about this feature.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I see some discussion (see [2], [3]) around this topic, but it
>>>>>>>>>>>>> looks to me that it isn't as popular as I believe it should be.
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1]: https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage
>>>>>>>>>>>>> [2]: https://github.com/apache/accumulo/issues/428
>>>>>>>>>>>>> [3]: https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>> Maxim
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> There are ways and there are ways,
>>>>>>>>>>>>
>>>>>>>>>>>> Geoffry Roberts
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> There are ways and there are ways,
>>>>>>>>>>
>>>>>>>>>> Geoffry Roberts
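
The question asked a few times in this thread - whether Accumulo properties can make WAL writes bigger and less frequent - maps onto a handful of tserver settings. Below is a hedged, untested sketch of the knobs as they existed around the 1.7/1.8 line; the property names are from memory of the docs of that era, the values are purely illustrative, and both should be checked against the user manual for your version:

```xml
<!-- accumulo-site.xml - hedged sketch, not a tested configuration. -->

<!-- Buffer more mutations server-side before flushing to the WAL,
     so each WAL write is larger and less frequent. -->
<property>
  <name>tserver.total.mutation.queue.max</name>
  <value>200M</value>
</property>

<!-- Larger WAL files mean fewer create/close cycles against the
     object store, where open/close is the expensive operation. -->
<property>
  <name>tserver.walog.max.size</name>
  <value>2G</value>
</property>

<!-- "flush" avoids an hsync on every batch; this trades some durability
     for throughput, and can instead be set per table in the shell. -->
<property>
  <name>table.durability</name>
  <value>flush</value>
</property>
```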
