> I'm not convinced this is a write pattern issue, though. I commented on..
The note there suggests the need for a LogCloser implementation; in my (ADLS) case I've written one and have it configured - the exception I'm seeing involves failures during writes, not during recovery (though it then leads to a need for recovery).

S.

On Fri, Jun 22, 2018 at 4:33 PM, Christopher <[email protected]> wrote:

> Unfortunately, that feature wasn't added until 2.0, which hasn't yet been released, but I'm hoping it will be later this year.
>
> I'm not convinced this is a write pattern issue, though. I commented on https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103#issuecomment-399608543
>
> On Fri, Jun 22, 2018 at 1:50 PM Stephen Meyles <[email protected]> wrote:
>
>> Knowing that HBase has been run successfully on ADLS, I went looking there (as it has the same WAL write pattern). This is informative:
>>
>> https://www.cloudera.com/documentation/enterprise/5-12-x/topics/admin_using_adls_storage_with_hbase.html
>>
>> which suggests a need to split the WALs off onto HDFS proper versus ADLS (or presumably GCS), barring changes in the underlying semantics of each. AFAICT you can't currently configure Accumulo to send WAL logs to a separate cluster - is this correct?
>>
>> S.
>>
>> On Fri, Jun 22, 2018 at 9:07 AM, Stephen Meyles <[email protected]> wrote:
>>
>>> > Did you try to adjust any Accumulo properties to do bigger writes less frequently or something like that?
>>>
>>> We're using BatchWriters and sending reasonably large batches of Mutations. Given that the stack traces in both our cases are related to WAL writes, it seems like batch size would be the only tweak available here (though, without reading the code carefully, it's not even clear to me that it is impactful), but if others have suggestions I'd be happy to try them.
>>>
>>> Given that we have this working well and stably in other clusters atop traditional HDFS, I'm currently pursuing this further with MS to understand the variance with ADLS. Depending on what emerges from that, I may circle back with more details and a bug report, and start digging more deeply into the relevant code in Accumulo.
>>>
>>> S.
>>>
>>> On Fri, Jun 22, 2018 at 6:09 AM, Maxim Kolchin <[email protected]> wrote:
>>>
>>>> > If somebody is interested in using Accumulo on GCS, I'd like to encourage them to submit any bugs they encounter, and any patches (if they are able) which resolve those bugs.
>>>>
>>>> I'd like to contribute a fix, but I don't know where to start. We tried to get help from Google Support about [1] over email, but they just say that GCS doesn't support such a write pattern. In the end, we can only guess how to adjust the Accumulo behaviour to minimise broken connections to GCS.
>>>>
>>>> BTW, although we observe this exception, the tablet server doesn't fail, so it means that after some retries it is able to write WALs to GCS.
>>>>
>>>> @Stephen,
>>>>
>>>> > as discussions with MS engineers have suggested, similar to the GCS thread, that small writes at high volume are, at best, suboptimal for ADLS.
>>>>
>>>> Did you try to adjust any Accumulo properties to do bigger writes less frequently or something like that?
>>>>
>>>> [1]: https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103
>>>>
>>>> Maxim
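For reference on the batch-size question above: the client-side knobs for bigger, less frequent writes live on BatchWriterConfig in the 1.x Connector API. The following is a minimal sketch; the values are illustrative assumptions rather than tested recommendations, and, as Stephen notes, it is not clear how much client batching changes the tserver's WAL write pattern.

    import java.nio.charset.StandardCharsets;
    import java.util.concurrent.TimeUnit;

    import org.apache.accumulo.core.client.BatchWriter;
    import org.apache.accumulo.core.client.BatchWriterConfig;
    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.data.Mutation;
    import org.apache.accumulo.core.data.Value;
    import org.apache.hadoop.io.Text;

    public class BatchTuningExample {

      // Buffer more mutations client-side so flushes to the tservers are
      // bigger and less frequent; the numbers are illustrative only.
      static BatchWriterConfig largerLessFrequentBatches() {
        return new BatchWriterConfig()
            .setMaxMemory(256 * 1024 * 1024L)    // hold up to ~256 MB of mutations
            .setMaxLatency(2, TimeUnit.MINUTES)  // or until 2 minutes have passed
            .setMaxWriteThreads(4);              // modest send parallelism
      }

      static void writeOne(Connector conn, String table) throws Exception {
        BatchWriter bw = conn.createBatchWriter(table, largerLessFrequentBatches());
        Mutation m = new Mutation(new Text("row1"));
        m.put(new Text("cf"), new Text("cq"), new Value("v".getBytes(StandardCharsets.UTF_8)));
        bw.addMutation(m);
        bw.close(); // close() flushes anything still buffered
      }
    }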
>>>> On Thu, Jun 21, 2018 at 7:17 AM Stephen Meyles <[email protected]> wrote:
>>>>
>>>>> I think we're seeing something similar, but in our case we're trying to run Accumulo atop ADLS. When we generate sufficient write load, we start to see stack traces like the following:
>>>>>
>>>>> [log.DfsLogger] ERROR: Failed to write log entries
>>>>> java.io.IOException: attempting to write to a closed stream;
>>>>>   at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:88)
>>>>>   at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:77)
>>>>>   at org.apache.hadoop.fs.adl.AdlFsOutputStream.write(AdlFsOutputStream.java:57)
>>>>>   at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:48)
>>>>>   at java.io.DataOutputStream.write(DataOutputStream.java:88)
>>>>>   at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
>>>>>   at org.apache.accumulo.tserver.logger.LogFileKey.write(LogFileKey.java:87)
>>>>>   at org.apache.accumulo.tserver.log.DfsLogger.write(DfsLogger.java:537)
>>>>>
>>>>> We have developed a rudimentary LogCloser implementation that allows us to recover from this, but overall performance is significantly impacted.
>>>>>
>>>>> > As for the WAL closing issue on GCS, I recall a previous thread about that
>>>>>
>>>>> I searched more for this but wasn't able to find anything, nor anything similar re: ADL. I am also curious about the earlier question:
>>>>>
>>>>> >> Does Accumulo have a specific write pattern [to WALs], so that a file system may not support it?
>>>>>
>>>>> as discussions with MS engineers have suggested, similar to the GCS thread, that small writes at high volume are, at best, suboptimal for ADLS.
>>>>>
>>>>> Regards
>>>>>
>>>>> Stephen
>>>>>
>>>>> On Wed, Jun 20, 2018 at 11:20 AM, Christopher <[email protected]> wrote:
>>>>>
>>>>>> For what it's worth, this is an Apache project, not a Sqrrl project. Amazon is free to contribute to Accumulo to improve its support of their platform, just as anybody is free to do. Amazon may start contributing more as a result of their acquisition... or they may not. There is no reason to expect that their acquisition will have any impact whatsoever on the platforms Accumulo supports, because Accumulo is not, and has never been, a Sqrrl project (although some Sqrrl employees have contributed), and thus will not become an Amazon project. It has been, and will remain, a vendor-neutral Apache project. Regardless, we welcome contributions from anybody that would improve Accumulo's support of any additional platform alternatives to HDFS, whether it be GCS, S3, or something else.
>>>>>>
>>>>>> As for the WAL closing issue on GCS, I recall a previous thread about that... I think a simple patch might be possible to solve that issue, but to date, nobody has contributed a fix. If somebody is interested in using Accumulo on GCS, I'd like to encourage them to submit any bugs they encounter, and any patches (if they are able) which resolve those bugs. If they need help submitting a fix, please ask on the dev@ list.
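A note on the "rudimentary LogCloser" mentioned above: WAL recovery in Accumulo 1.x goes through a LogCloser implementation, and the stock HadoopLogCloser handles HDFS (and local) filesystems. A minimal sketch for an object-store-backed WAL volume such as ADLS or GCS might look like the following; the interface name and signature (org.apache.accumulo.server.master.recovery.LogCloser) and the property name below are as I recall them from 1.x and should be verified against the release in use, and the class/package names are hypothetical.

    package com.example.accumulo; // hypothetical package

    import org.apache.accumulo.core.conf.AccumuloConfiguration;
    import org.apache.accumulo.server.fs.VolumeManager;
    import org.apache.accumulo.server.master.recovery.LogCloser;
    import org.apache.hadoop.fs.Path;

    // Sketch of a LogCloser for stores without HDFS-style lease recovery
    // (ADLS/GCS): the server-side stream is effectively closed once the
    // writer's connection drops, so there is no lease to recover.
    public class ObjectStoreLogCloser implements LogCloser {

      @Override
      public long close(AccumuloConfiguration conf, VolumeManager fs, Path path) {
        // Returning 0 tells the master the WAL can be sorted/recovered now;
        // a positive value asks it to retry after that many milliseconds
        // (behaviour assumed from HadoopLogCloser).
        return 0;
      }
    }

It would then be wired in with something like master.walog.closer.implementation=com.example.accumulo.ObjectStoreLogCloser in accumulo-site.xml, with the jar on the server classpath.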
>>>>>> On Wed, Jun 20, 2018 at 8:21 AM Geoffry Roberts <[email protected]> wrote:
>>>>>>
>>>>>>> Maxim,
>>>>>>>
>>>>>>> Interesting that you were able to run Accumulo on GCS. I never thought of that--good to know.
>>>>>>>
>>>>>>> Since I am now an AWS guy (at least for the time being), in light of the fact that Amazon purchased Sqrrl, I am interested to see what develops.
>>>>>>>
>>>>>>> On Wed, Jun 20, 2018 at 5:15 AM, Maxim Kolchin <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Geoffry,
>>>>>>>>
>>>>>>>> Thank you for the feedback!
>>>>>>>>
>>>>>>>> Thanks to [1, 2], I was able to run an Accumulo cluster on Google VMs with GCS instead of HDFS, and I used Google Dataproc to run Hadoop jobs on Accumulo. Almost everything was good until I faced some connection issues with GCS. Quite often, the connection to GCS breaks on writing or closing WALs.
>>>>>>>>
>>>>>>>> To all,
>>>>>>>>
>>>>>>>> Does Accumulo have a specific write pattern, so that a file system may not support it? Are there Accumulo properties which I can play with to adjust the write pattern?
>>>>>>>>
>>>>>>>> [1]: https://github.com/cybermaggedon/accumulo-gs
>>>>>>>> [2]: https://github.com/cybermaggedon/accumulo-docker
>>>>>>>>
>>>>>>>> Thank you!
>>>>>>>> Maxim
>>>>>>>>
>>>>>>>> On Tue, Jun 19, 2018 at 10:31 PM Geoffry Roberts <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> I tried running Accumulo on Google. I first tried running it on Google's pre-made Hadoop. I found the various file paths one must contend with are different on Google than on a straight download from Apache; it seems they moved things around. To counter this, I installed my own Hadoop along with Zookeeper and Accumulo on a Google node. All went well until one fine day when I could no longer log in. It seems Google had pushed out some changes overnight that broke my client-side Google Cloud installation. Google referred the affected to a lengthy, easy-to-make-a-mistake procedure for resolving the issue.
>>>>>>>>>
>>>>>>>>> I decided life was too short for this kind of thing and switched to Amazon.
>>>>>>>>>
>>>>>>>>> On Tue, Jun 19, 2018 at 7:34 AM, Maxim Kolchin <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Does anyone have experience running Accumulo on top of Google Cloud Storage instead of HDFS? In [1] you can see some details if you've never heard of this feature.
>>>>>>>>>>
>>>>>>>>>> I see some discussion (see [2], [3]) around this topic, but it looks to me like this isn't as popular as I believe it should be.
>>>>>>>>>>
>>>>>>>>>> [1]: https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage
>>>>>>>>>> [2]: https://github.com/apache/accumulo/issues/428
>>>>>>>>>> [3]: https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Maxim
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> There are ways and there are ways,
>>>>>>>>>
>>>>>>>>> Geoffry Roberts
>>>>>>>
>>>>>>> --
>>>>>>> There are ways and there are ways,
>>>>>>>
>>>>>>> Geoffry Roberts
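For anyone wanting to reproduce the GCS setup Maxim describes (the cybermaggedon images in [1] and [2] point Accumulo's storage at a gs:// bucket), the relevant server property is instance.volumes. A minimal accumulo-site.xml sketch follows; the bucket name is a placeholder, and it assumes the GCS connector jar is on Accumulo's classpath with its fs.gs.* settings configured in core-site.xml.

    <property>
      <name>instance.volumes</name>
      <value>gs://my-accumulo-bucket/accumulo</value>
    </property>

Keeping WALs on a real HDFS cluster while tablet files live on GCS or ADLS, the split the Cloudera HBase-on-ADLS page describes, is the capability Christopher notes above was not added until Accumulo 2.0.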
