Re: improving access to telemetry data

2013-02-28 Thread Josh Aas
On Thursday, February 28, 2013 8:16:50 AM UTC-6, Benjamin Smedberg wrote:

 Have we discussed this project with the metrics team yet?

When I started looking into this it wasn't clear to me that people really knew 
what they wanted - they just knew why the existing system didn't work for them. 
My first goal is to understand what people want, then see what we can build. 
Your questions are good steps towards understanding the latter, which I haven't 
really started on yet. I'm curious to know the answers. I have talked to 
metrics a bit, but mostly to get basic background information on how the system 
works. I haven't asked how to do anything else in particular yet.

FYI:

Taras has done some hacking on an experimental lightweight UI; it can be found 
here:

http://people.mozilla.org/~tglek/dashboard/

It works using JSON dumps of data, like this:

http://people.mozilla.org/~tglek/dashboard/data/DNS_CLEANUP_AGE.json
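
For anyone who wants to script against these dumps, consuming one looks
roughly like the following. A minimal sketch, assuming the dump maps build
IDs to {bucket: count} histograms; the actual schema isn't documented in
this thread, so that shape may not match the real files:

import json
import urllib.request

URL = "http://people.mozilla.org/~tglek/dashboard/data/DNS_CLEANUP_AGE.json"

with urllib.request.urlopen(URL) as resp:
    data = json.load(resp)

# Assumed shape: {build_id: {bucket_value: count}, ...}
for build_id, histogram in sorted(data.items()):
    print(build_id, sum(histogram.values()), "samples")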


Re: improving access to telemetry data

2013-02-28 Thread Benoit Jacob
Please, please make your plans include the ability to get raw text files
(CSV or JSON or something else, I don't care as long as I can easily parse
it). I have no use for the current front-end, and I believe that no
front-end will cover everyone's needs; nor is every developer familiar
with databases (not even SQL), while almost everyone can easily do things
with raw text files.

As a second request, please make as much data as possible public.

With public data in the form of raw text files, a lot of things become
possible. Thankfully we have that for crash reports (
https://crash-analysis.mozilla.com/crash_analysis/ ) and that allows me to
make much more useful Bugzilla comments than I otherwise could - because I
can link to a public CSV file and give a Bash command (with cut|grep|wc)
that reproduces the result I'm claiming.
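
To make that concrete, here is a minimal Python sketch of the same kind of
check; the file name, delimiter, and column index are made up for
illustration:

import csv
import gzip

SIGNATURE_COLUMN = 0  # hypothetical: the column holding the crash signature

# Hypothetical public dump, downloaded locally.
with gzip.open("20130226-pub-crashdata.csv.gz", "rt", newline="") as f:
    rows = list(csv.reader(f, delimiter="\t"))

matching = sum(1 for row in rows
               if row and "SomeSignature" in row[SIGNATURE_COLUMN])
print(matching, "of", len(rows), "reports match")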

Benoit

2013/2/27 Josh Aas josh...@gmail.com

 I've been thinking about how we might improve access to telemetry data. I
 don't want to get too much into the issues with the current front-end, so
 suffice it to say that it isn't meeting my group's needs.

 I solicited ideas from Justin Lebar and Patrick McManus, and with those I
 came up with an overview of how those of us in platform engineering would
 ideally like to be able to access telemetry data. I'd love to hear thoughts
 and feedback.

 ===

 At the highest level, we want to make decisions, mostly regarding
 optimizations, based on feedback from the wild. We want to know when a
 change makes something better or worse. We also want to be able to spot
 regressions.

 The top priority for exposing telemetry data is an easy-to-use API.
 Secondarily, there should be a default front-end.

 An API will allow developers to innovate at the edges by building tools
 to fit their needs. This will also remove the one-size-fits-all requirement
 from the default front-end. The API should be easier to use than mucking
 with hadoop/hbase directly. It might be a RESTful JSON API. It should not
 require people to apply for special privileges.
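
 For concreteness, consuming such an API might look something like this (a
 sketch; the endpoint and parameters are hypothetical):

 import json
 import urllib.parse
 import urllib.request

 params = urllib.parse.urlencode({
     "measure": "DNS_CLEANUP_AGE",  # histogram name
     "channel": "nightly",
     "from_build": "20130220",
     "to_build": "20130227",
 })
 url = "https://telemetry-api.example.org/v1/histogram?" + params

 with urllib.request.urlopen(url) as resp:
     result = json.load(resp)  # e.g. {"buckets": {...}, "submissions": N}
 print(result)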

 The default front-end should be fast, stable, and flexible. It should be
 based as much as possible on existing products and frameworks. Being able
 to modify the display of data as needs change should be easy to do, to
 avoid long wait times for new views of the data. It should provide
 generally useful views of the data, breaking down results by build (builds
 contain dates in their IDs), so we can see how results change from one
 build to another. We want to be able to see statistical analyses such as
 standard deviations and cumulative distribution functions.

 We would also like to be able to run A/B experiments. This means coming up
 with a better instrumentation framework for code in the builds, but it also
 means having a dashboard that understands the results, and can show
 comparisons between A and B users that otherwise have the same builds.



Re: improving access to telemetry data

2013-02-28 Thread Benjamin Smedberg

On 2/28/2013 10:33 AM, Benoit Jacob wrote:

Please, please make your plans include the ability to get raw text files
(CSV or JSON or something else, I don't care as long as I can easily parse
it).
Could you be more specific? Note that while text files are currently
provided on crash-analysis, they are not the full dataset: they include
a limited and specific set of fields. It looks to me like telemetry 
payloads typically include many more fields, and some of these are not 
single-value fields but rather more complex histograms and such. Putting 
all of that into text files may leave us with unworkably large text files.


Because the raw crash files do not include new metadata fields, this has 
led to weird engineering practices like shoving interesting metadata 
into the freeform app notes field, and then parsing that data back out 
later. I'm worried about perpetuating this kind of behavior, which is 
hard on the database and leads to very arcane queries in many cases.


What is the current volume of telemetry pings per day?

--BDS



Re: improving access to telemetry data

2013-02-28 Thread Benoit Jacob
2013/2/28 Benjamin Smedberg benja...@smedbergs.us

 On 2/28/2013 10:33 AM, Benoit Jacob wrote:

 Please, please make your plans include the ability to get raw text files
 (CSV or JSON or something else, I don't care as long as I can easily parse
 it).

 Could you be more specific? Note that while text files are currently
 provided on crash-analysis, they are not the full dataset: they include a
 limited and specific set of fields.


I know, but that's already plenty of data to do many useful things.

As to being more specific, here's an example of something I'm currently
doing with CSV crash report dumps:
http://people.mozilla.org/~bjacob/gfx_features_stats/
Obviously I would be very interested in the ability to do the same with
Telemetry instead.

Another example I mentioned above is Bugzilla comments getting data from
dumps; here's a link just to give an example:
https://bugzilla.mozilla.org/show_bug.cgi?id=771774#c28


 It looks to me like telemetry payloads typically include many more fields,
 and some of these are not single-value fields but rather more complex
 histograms and such. Putting all of that into text files may leave us with
 unworkably large text files.


Good point; so I suppose that would support using JSON instead of CSV,
as in Josh's second email in this thread, which I hadn't seen when I wrote
this.



 Because the raw crash files do not include new metadata fields, this has
 led to weird engineering practices like shoving interesting metadata into
 the freeform app notes field, and then parsing that data back out later.
 I'm worried about perpetuating this kind of behavior, which is hard on the
 database and leads to very arcane queries in many cases.


I don't agree with the notion that freeform fields are bad. Freeform plain
text is an amazing format. It allows adding any kind of data without
administrative overhead and is still easy to parse (if the data that was
added was formatted with easy parsing in mind).

But if one considers it a bad thing that people use it, then one should
address the issues that are causing people to use it. As you mention, raw
crash files may not include newer metadata fields. So maybe that can be
fixed by making it easier, or even automatic, to include new fields in raw
crash files?

Related/similar conversation in
https://bugzilla.mozilla.org/show_bug.cgi?id=641461

Benoit




 What is the current volume of telemetry pings per day?

 --BDS




Re: What data do you want from telemetry (was Re: improving access to telemetry data)

2013-02-28 Thread Patrick McManus
On Thu, Feb 28, 2013 at 10:36 AM, Benjamin Smedberg
benja...@smedbergs.us wrote:

 Cool. Perhaps we should start out with collecting stories/examples:


In that spirit:

What I almost always want to do is simple: for the last N days of
variable X, show me a CDF (at even just 10-percentile granularity) for
the histogram, and let me break that down by sets of build ID and/or
OS. That's it.
For instance - what is my median time to ready for HTTP vs. HTTPS
connections (I've got data for both of those)? What about their tails?
How did they change based on some checkin I'm interested in? Not
rocket science - but incredibly painful to even approximate in the
current front end. You can kind of do it, but with a bunch of fudging
and manual addition required, and it takes forever. I'll admit I get
frustrated with all the talk of EC and Hadoop and what-not when it
really seems a rather straightforward task for me to script on the
data.

Gimme the data set and I can just script it instead of spending an
hour laboriously clicking on things and waiting 15 seconds for every
click.
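
For what it's worth, the CDF part really is a few lines of scripting once
you have the bucketed counts. A sketch, with a made-up histogram shape and
made-up numbers:

def percentiles(histogram, points=(10, 25, 50, 75, 90, 95, 99)):
    """Approximate percentiles from {bucket_lower_bound: count} data;
    each sample is credited to its bucket's lower bound."""
    total = sum(histogram.values())
    buckets = sorted(histogram.items())
    results, cumulative, idx = {}, 0, 0
    for p in sorted(points):
        threshold = total * p / 100.0
        while idx < len(buckets) and cumulative + buckets[idx][1] < threshold:
            cumulative += buckets[idx][1]
            idx += 1
        results[p] = buckets[min(idx, len(buckets) - 1)][0]
    return results

# Made-up time-to-ready buckets (ms) -> sample counts
http_ready = {0: 120, 50: 900, 100: 1500, 250: 700, 500: 200, 1000: 60}
print(percentiles(http_ready))  # median at 100, p99 out at 1000

Breaking that down by build ID and/or OS is then just a matter of grouping
pings into subsets and calling the same function on each.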

Reports from the front end seem to indicate that there are 60 million
submissions in the last month across all channels for one of the
things I'm tracking; 651K of those are from Nightly, FWIW.


Re: improving access to telemetry data

2013-02-28 Thread Jeff Muizelaar

On 2013-02-28, at 10:44 AM, Benjamin Smedberg wrote:

 On 2/28/2013 10:33 AM, Benoit Jacob wrote:
 Please, please make your plans include the ability to get raw text files
 (CSV or JSON or something else, I don't care as long as I can easily parse
 it).
 Could you be more specific? Note that while text files are currently provided
 on crash-analysis, they are not the full dataset: they include a limited and
 specific set of fields. It looks to me like telemetry payloads typically 
 include many more fields, and some of these are not single-value fields but 
 rather more complex histograms and such. Putting all of that into text files 
 may leave us with unworkably large text files.

I've also been using these text files for gathering CPU-specific information:
https://github.com/jrmuizel/cpu-features

e.g.:

sse2 97.5126791126%
amd 30.9852560634%
coreavg 2.29447221529
coremax 32
mulicore 81.239118672%
windowsxp 34.260938801%
fourcore 19.3838329749%

('GenuineIntel', 6, 23) 16.8679157473 Core 2 Duo 45nm
('GenuineIntel', 6, 15) 11.4360128852 Core 2 Duo Allendale/Kentsfield 65nm
('GenuineIntel', 6, 42) 9.75864128134 Core i[735] Sandybridge
('AuthenticAMD', 20, 2) 7.62036395852 AMD64 C-60
('GenuineIntel', 6, 37) 6.61528670635 Core i[735] Westmere
('GenuineIntel', 15, 4) 6.41735517496 Pentium 4 Prescott 2M 90nm
('AuthenticAMD', 20, 1) 5.85243727729 AMD64 C-50
('AuthenticAMD', 16, 6) 4.60457148945 AMD64 Athlon II
('GenuineIntel', 15, 2) 3.34941643841 Pentium 4 Northwood 130nm
('GenuineIntel', 15, 6) 2.74496830572 Pentium D
('GenuineIntel', 6, 28) 2.56862420764 Atom
('AuthenticAMD', 15, 107) 1.78671055177 AMD64 X2
('GenuineIntel', 6, 22) 1.55990232387 Core based Celeron 65nm
('GenuineIntel', 6, 58) 1.47130974042 Core i[735] Ivybridge
('GenuineIntel', 6, 14) 1.18506598185 Core Duo 65nm
('GenuineIntel', 15, 3) 1.10796800574 Pentium 4 Prescott 90nm
('AuthenticAMD', 6, 8) 0.963864879489 Athlon (Palomino) XP/Duron
('GenuineIntel', 6, 13) 0.907232911584 Pentium M
('AuthenticAMD', 16, 5) 0.876674077418 Athlon II X4
('AuthenticAMD', 17, 3) 0.81163142121 
('AuthenticAMD', 18, 1) 0.7381780767 
('AuthenticAMD', 15, 44) 0.702012116998 
('AuthenticAMD', 16, 4) 0.670892570278 Athlon II X4
('AuthenticAMD', 16, 2) 0.621830221846 Athlon II X2
('GenuineIntel', 15, 1) 0.608373120562 Pentium 4 Willamette 180nm
('GenuineIntel', 6, 30) 0.578935711502 
('AuthenticAMD', 15, 127) 0.576692861288 
('AuthenticAMD', 6, 10) 0.545012602015 Athlon MP
('AuthenticAMD', 15, 104) 0.502959160501 
('AuthenticAMD', 15, 75) 0.486698496449 
('AuthenticAMD', 15, 47) 0.463148569202 
('AuthenticAMD', 15, 67) 0.423898690456 
('AuthenticAMD', 15, 95) 0.413805864493 
('GenuineIntel', 6, 54) 0.404834463636 
('AuthenticAMD', 15, 79) 0.327456131252 
('AuthenticAMD', 21, 16) 0.294654446871 
('GenuineIntel', 6, 26) 0.278954495373 
('GenuineIntel', 6, 8) 0.252881361634 Pentium III Coppermine 0.18 um
('AuthenticAMD', 21, 1) 0.234938559922 
('AuthenticAMD', 16, 10) 0.201576162988 
('AuthenticAMD', 15, 72) 0.163167353072 
('GenuineInte', 0, 0) 0.159242365198 
('AuthenticAMD', 6, 6) 0.150270964341 Athlon XP
('AuthenticAMD', 15, 76) 0.132608518906 
('GenuineIntel', 6, 9) 0.130926381245 
('AuthenticAMD', 15, 12) 0.105974672614 
('GenuineIntel', 6, 7) 0.102610397293 Pentium III Katmai 0.25 um
('GenuineIntel', 6, 44) 0.0986854094183 
('AuthenticAMD', 15, 124) 0.0964425592042 
('AuthenticAMD', 15, 4) 0.0939193527134 
('GenuineIntel', 6, 11) 0.0894336522853 Pentium III Tualatin 0.13 um
('CentaurHauls', 6, 13) 0.0832658141967 
('AuthenticAMD', 15, 28) 0.0827051016432 
('AuthenticAMD', 15, 36) 0.0695283566356 
('AuthenticAMD', 6, 7) 0.055230186521 Duron Morgan
('AuthenticAMD', 15, 43) 0.0521462674767 
('GenuineIntel', 6, 45) 0.0428945103437 
('AuthenticAM', 0, 0) 0.0420534415135 
('GenuineIntel', 15, 0) 0.0392498787459 
('AuthenticAMD', 15, 35) 0.0361659597016 
('AuthenticAMD', 6, 4) 0.0361659597016 Athlon
('AuthenticAMD', 15, 31) 0.0299981216129 
('AuthenticAMD', 6, 3) 0.0297177653362 Athlon Duron
('AuthenticAMD', 15, 39) 0.0260731337384 
('AuthenticAMD', 21, 2) 0.0179428017124 
('AuthenticAMD', 15, 15) 0.0176624454357 
('CentaurHauls', 6, 10) 0.0159803077751 
('GenuineIntel', 6, 10) 0.0117749636238 
('AuthenticAMD', 15, 63) 0.0117749636238 
('AuthenticAMD', 15, 55) 0.011494607347 
('CentaurHauls', 6, 15) 0.01037318224 
('GenuineIntel', 6, 6) 0.00953211340972 Pentium II Mendocino 0.25 um
('CentaurHauls', 6, 9) 0.00728926319567 
('GenuineIntel', 6, 5) 0.00672855064216 Pentium II Deschutes 0.25 um
('CentaurHauls', 6, 7) 0.00532676925837 VIA Ezra/Samuel 2
('AuthenticAMD', 15, 108) 0.00364463159783 
('AuthenticAMD', 15, 33) 0.00364463159783 
('GenuineInte', 6, 15) 0.00336427532108 
('AuthenticAMD', 15, 7) 0.00336427532108 
('GenuineIntel', 6, 53) 0.00308391904432 
('AuthenticAMD', 15, 8) 0.00308391904432 
('GenuineIntel', 6, 47) 0.00280356276757 
('AuthenticAMD', 16, 8) 0.00280356276757 
('GenuineIntel', 6, 46) 0.00224285021405 
('GenuineInte', 6, 23) 0.00224285021405 

Re: improving access to telemetry data

2013-02-28 Thread Justin Lebar
It sounds to me like people want both

1) Easier access to aggregated data so they can build their own
dashboards roughly comparable in features to the current dashboards.

2) Easier access to raw databases so that people can build up more
complex analyses, either by exporting the raw data from the db, or by
analyzing it in the db.

That is, I don't think we can or should export JSON with all the data
in our databases.  That is a lot of data.

On Thu, Feb 28, 2013 at 12:08 PM, Benjamin Smedberg
benja...@smedbergs.us wrote:
 On 2/28/2013 10:59 AM, Benoit Jacob wrote:

 Because the raw crash files do not include new metadata fields, this has
 led to weird engineering practices like shoving interesting metadata into
 the freeform app notes field, and then parsing that data back out later.
 I'm worried about perpetuating this kind of behavior, which is hard on
 the
 database and leads to very arcane queries in many cases.

 I don't agree with the notion that freeform fields are bad. Freeform plain
 text is an amazing format. It allows adding any kind of data without
 administrative overhead and is still easy to parse (if the data that was
 added was formatted with easy parsing in mind).

 The obvious disadvantage is that it is much more difficult to
 machine-process. For example, Elasticsearch can't index on it (at least not
 without lots of custom parsing), and in general you can't ask tools like
 HBase or Elasticsearch to filter on that without a user-defined function.
 (Regexes might work for some kinds of text processing.)


 But if one considers it a bad thing that people use it, then one should
 address the issues that are causing people to use it. As you mention, raw
 crash files may not include newer metadata fields. So maybe that can be
 fixed by making it easier, or even automatic, to include new fields in raw
 crash files?

 Yes, that is all filed. We can't automatically include new fields, because we
 don't know whether they are supposed to be public or private, but we should
 soon be able to have a dynamically updateable list.

 Note that if mcmanus is correct, we're going to be dealing with 1M fields
 per day here. That's a lot more than the 250k from crash-stats, especially
 because the payload is bigger. I believe that the flat files from
 crash-stats are a really useful kludge because we couldn't figure out a
 better way to expose the raw data. But that kludge will start to fall over
 pretty quickly, and perhaps we should just expose a better way to do queries
 using the databases, which are surprisingly good at doing these kinds of
 queries efficiently.


 --BDS



Re: improving access to telemetry data (Help Wanted)

2013-02-28 Thread Taras Glek



Justin Lebar wrote:

It sounds to me like people want both

1) Easier access to aggregated data so they can build their own
dashboards roughly comparable in features to the current dashboards.


I doubt people actually want to build their own dashboards. I suspect this is 
mainly a need because of deficiencies in the current dashboard.




2) Easier access to raw databases so that people can build up more
complex analyses, either by exporting the raw data from the db, or by
analyzing it in the db.

That is, I don't think we can or should export JSON with all the data
in our databases.  That is a lot of data.


From concrete examples I've seen so far, people want basic
aggregations. My FE in http://people.mozilla.org/~tglek/dashboard/ works
on aggregated histogram JSONs. It seems completely reasonable to
aggregate all of the other info + simple_measurement fields (and this is
on my TODO). This would solve all of the other concrete use cases
mentioned (Flash versions, hardware stats).
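
The aggregation itself is simple. A sketch of the idea in plain Python; the
real telemetry payload layout has more structure than the assumed
{"histograms": {name: {bucket: count}}} shape used here:

from collections import Counter

def aggregate(pings, measure):
    # Sum bucket counts for one histogram across many pings.
    total = Counter()
    for ping in pings:
        for bucket, count in ping["histograms"].get(measure, {}).items():
            total[bucket] += count
    return dict(total)

pings = [
    {"histograms": {"DNS_CLEANUP_AGE": {"0": 3, "60": 1}}},
    {"histograms": {"DNS_CLEANUP_AGE": {"60": 2, "120": 5}}},
]
print(aggregate(pings, "DNS_CLEANUP_AGE"))  # {'0': 3, '60': 3, '120': 5}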


I think we can be more aggressive still. We can also allow filtering
certain histograms by one of those highly variable info fields (e.g. tab
animations vs. gfx hardware, specific chromehangs vs. something useful,
etc.) without unreasonable overhead.


I like my aggregated JSON approach because it's cheap on server CPU and,
as long as one partitions the JSON carefully, it can be compact enough for
gzip encoding to make it fast enough to download. This should also make
it easy to fork the dashboards, contribute, etc.


I hope to feed more data into my frontend by end of today and will aim 
for a live-ish dashboard by end of next week.


For advanced use cases, we can stick with Hadoop querying.

==Help wanted==

If anyone knows a dev who is equally good at stats & programming, let me
know. I think we have a lot of useful data, and we can handle some
visualizations of that data, but a person skilled at extracting signal
out of noisy sources could help us squeeze the most use out of our data.



I spend too much time on management to make quick progress. I wrote up
the prototype to prove to myself that the JSON schema is feasible.


If someone wants to help with aggregations, I can hook you up with raw
JSON dumps from Hadoop. For everything else, the code is on GitHub
(https://github.com/tarasglek/telemetry-frontend).
Help wanted: UX improvements such as easier-to-use selectors,
incremental search, and switching to superior charting such as
flotcharts.org.




On Thu, Feb 28, 2013 at 12:08 PM, Benjamin Smedberg
benja...@smedbergs.us  wrote:

On 2/28/2013 10:59 AM, Benoit Jacob wrote:

Because the raw crash files do not include new metadata fields, this has
led to weird engineering practices like shoving interesting metadata into
the freeform app notes field, and then parsing that data back out later.
I'm worried about perpetuating this kind of behavior, which is hard on
the
database and leads to very arcane queries in many cases.


I don't agree with the notion that freeform fields are bad. Freeform plain
text is an amazing format. It allows adding any kind of data without
administrative overhead and is still easy to parse (if the data that was
added was formatted with easy parsing in mind).

The obvious disadvantage is that it is much more difficult to
machine-process. For example, Elasticsearch can't index on it (at least not
without lots of custom parsing), and in general you can't ask tools like
HBase or Elasticsearch to filter on that without a user-defined function.
(Regexes might work for some kinds of text processing.)


But if one considers it a bad thing that people use it, then one should
address the issues that are causing people to use it. As you mention, raw
crash files may not include newer metadata fields. So maybe that can be
fixed by making it easier, or even automatic, to include new fields in raw
crash files?

Yes, that is all filed. We can't automatically include new fields, because we
don't know whether they are supposed to be public or private, but we should
soon be able to have a dynamically updateable list.

Note that if mcmanus is correct, we're going to be dealing with 1M fields
per day here. That's a lot more than the 250k from crash-stats, especially
because the payload is bigger. I believe that the flat files from
crash-stats are a really useful kludge because we couldn't figure out a
better way to expose the raw data. But that kludge will start to fall over
pretty quickly, and perhaps we should just expose a better way to do queries
using the databases, which are surprisingly good at doing these kinds of
queries efficiently.


--BDS



Re: improving access to telemetry data

2013-02-28 Thread Bill McCloskey
----- Original Message -----
 From: Justin Lebar justin.le...@gmail.com
 To: Benjamin Smedberg benja...@smedbergs.us
 Cc: Benoit Jacob jacob.benoi...@gmail.com, Josh Aas 
 josh...@gmail.com, dev-platform@lists.mozilla.org
 Sent: Thursday, February 28, 2013 9:14:52 AM
 Subject: Re: improving access to telemetry data
 
 It sounds to me like people want both
 
 1) Easier access to aggregated data so they can build their own
 dashboards roughly comparable in features to the current dashboards.
 
 2) Easier access to raw databases so that people can build up more
 complex analyses, either by exporting the raw data from the db, or by
 analyzing it in the db.
 
 That is, I don't think we can or should export JSON with all the data
 in our databases.  That is a lot of data.

I've used telemetry data a little bit for finding information about addon 
usage. It took me a while to figure out how to use Pig and run Hadoop jobs, and 
it would be great to have something easier to use. Based on what little I know, 
it seems like a lot of queries fit the following scheme:

1. Filter based on version and/or buildid as well as the product 
(Firefox/TB/Fennec).
2. Select a random sample of x% of all pings.
3. Dump out the JSON and process it in Python or via some other external tool 
(roughly as in the sketch below).
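
In plain Python, the generic job would be shaped roughly like this (a
sketch; the real thing would be a Pig/Hadoop job, and the ping field names
used here are assumptions):

import json
import random

def sample_pings(pings, product, buildid_prefix, fraction):
    # Steps 1 and 2: filter by product/buildid, then randomly sample.
    for ping in pings:
        info = ping.get("info", {})
        if info.get("appName") != product:
            continue
        if not str(info.get("appBuildID", "")).startswith(buildid_prefix):
            continue
        if random.random() < fraction:
            yield ping

# Step 3: dump the sampled JSON for external processing.
with open("pings.json") as f, open("sample.json", "w") as out:
    pings = (json.loads(line) for line in f)
    for ping in sample_pings(pings, "Firefox", "20130227", 0.01):
        out.write(json.dumps(ping) + "\n")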

This, at least, was sufficient for what I was doing. It sounds like it would 
also work for many of the applications people have suggested so far, 
although I might be misunderstanding.

It sounds like we might be able to come up with a few generic queries that 
could run each day. One could be for Nightly data with yesterday's buildid and 
another could be for recent Aurora submissions, etc. The data would be randomly 
sampled to generate a compressed JSON file of some reasonable size (maybe 
100MB) that would then be uploaded to an FTP server that everyone could access. 
The old files would be thrown away after a few weeks, although we could archive 
a few in case someone wants older data.

I'm sure that this wouldn't cover every single use case of telemetry. However, 
it could be used both for dashboards and to get the raw data. The random 
sampling seems like the biggest potential problem. However, you could compare 
data across a few days to see how significant the results are. At the very 
least, this data would make it easy to try out prototypes. Once you find 
something that works, you could create a more customized query that would be 
more specific to the application.

-Bill


Re: improving access to telemetry data

2013-02-28 Thread lauraxt
On Wednesday, February 27, 2013 12:52:10 PM UTC-5, Josh Aas wrote:
 I've been thinking about how we might improve access to telemetry data. I 
 don't want to get too much into the issues with the current front-end, so 
 suffice it to say that it isn't meeting my group's needs.
 
 

A few people have pinged me and asked me to respond on this thread, based on 
experiences with Socorro.

I think it would be good to work out the use cases that are needed, but here 
are some possible options for opening up the data:
- Open up a reporting instance of HBase for ad hoc queries (perhaps via Pig; 
this is easy to learn and is working well for Socorro, and there are newer 
options like Impala, too).
- Enable searches/faceting through ElasticSearch - we have some generic UI for 
this in the pipeline, which may be able to be reused.
- Create more JSON/CSV dumps and make them available. Many of the Socorro 
reports are created based on prototypes somebody in Engineering made with CSV 
data and a script.
- Consider dumping some of the data into a relational DB; see the sketch after 
this list. This is a common pattern (the so-called "data mullet") which makes 
querying accessible to a greater number of people.
- Build a simple API in front of one or more of these data sources to make it 
easier for people to write their own front ends and reporting.
- Finally, work on the front end to support the most commonly needed queries 
and reports. This could fall out of the work done on some of the other options.
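
As a sketch of the relational-DB option mentioned above (field names and
file layout are assumptions, just to show the shape of it):

import json
import sqlite3

db = sqlite3.connect("telemetry.db")
db.execute("""CREATE TABLE IF NOT EXISTS pings
              (buildid TEXT, os TEXT, payload TEXT)""")

with open("sample.json") as f:
    for line in f:
        info = json.loads(line).get("info", {})
        db.execute("INSERT INTO pings VALUES (?, ?, ?)",
                   (info.get("appBuildID"), info.get("OS"), line))
db.commit()

# Anyone who knows a little SQL can now ask questions:
for row in db.execute("SELECT os, COUNT(*) FROM pings GROUP BY os"):
    print(row)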

Cheers

Laura


Re: improving access to telemetry data (Help Wanted)

2013-02-28 Thread Robert Kaiser

Taras Glek wrote:

I doubt people actually want to build their own dashboards. I suspect this is
mainly a need because of deficiencies in the current dashboard.


I disagree. I think people will want to integrate Telemetry data in 
dashboards that connect data from different sources, and not just 
Telemetry. That might be combinations with FHR data, with crash data, or 
even other things.
I, for example, would love to have stability-related data from all those 
sources trimmed down by a dashboard to digestible "this channel looks 
good/bad" values.


Robert Kaiser


Re: LOAD_ANONYMOUS + LOAD_NOCOOKIES

2013-02-28 Thread bernhardredl
I have run hg pull and hg update on mozilla-central and still have no 
LOAD_DOCUMENT_URI. But I can skip 18 if you want.

I have a first draft of my patch (see link below).
The problem seems to be that this patch breaks the WHOLE cookie handling of 
Firefox. (At least it fixes my original sync bug. ;))

Maybe someone can point me to the part where I broke cookies.
I have attached the patch here:
https://bugzilla.mozilla.org/attachment.cgi?id=719791&action=diff

I also seek feedback on where to put testcases and what they should cover.

-Bernhard
On Saturday, February 23, 2013 3:24:33 AM UTC+1, Boris Zbarsky wrote:
 On 2/22/13 7:25 PM, bernhardr...@gmail.com wrote:
 
  const unsigned long LOAD_NOCOOKIES = 1 << 15;
  ... just stop sending / accepting cookies at this request
 
 1 << 15 is LOAD_FRESH_CONNECTION, no?
 
  const unsigned long LOAD_NOAUTH = 1 << 16;
  ... don't add authentication headers automatically
 
 1 << 16 is LOAD_DOCUMENT_URI.
 
 So at the very least you'll need different values for these constants.
 
 -Boris


Re: LOAD_ANONYMOUS + LOAD_NOCOOKIES

2013-02-28 Thread Boris Zbarsky

On 2/28/13 9:37 PM, bernhardr...@gmail.com wrote:

 I have run hg pull and hg update on mozilla-central and still have no
 LOAD_DOCUMENT_URI.

Are you looking in the right file?

 The problem seems to be that this patch breaks the WHOLE cookie handling of
 Firefox.

Because your LOAD_NO_COOKIES has the same value as LOAD_DOCUMENT_URI,
which means it's getting set for every web page load.

Oh, and your LOAD_NOAUTH_HEADER has the same value as
LOAD_RETARGETED_DOCUMENT_URI, which will lead to subtle bugs of its own,
of course.

You really can't introduce new flags to a bitfield that have the same
values as existing flags. It just doesn't work well. ;)
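
In miniature (a Python sketch; the real constants are IDL/C++ in
nsIRequest/nsIChannel):

LOAD_DOCUMENT_URI = 1 << 16  # existing flag, set on every top-level page load
LOAD_NO_COOKIES   = 1 << 16  # new flag accidentally defined with the same bit

page_load_flags = LOAD_DOCUMENT_URI
if page_load_flags & LOAD_NO_COOKIES:
    print("cookies disabled")  # fires for every document load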


On a more serious note, I believe at this point all the flags except
(1 << 25) are in use on HTTP channel, between nsIRequest, nsIChannel, and
nsICachingChannel.


-Boris


Re: LOAD_ANONYMOUS + LOAD_NOCOOKIES

2013-02-28 Thread bernhardredl
Yep, you are right. I assumed nsIRequest would be the only file assigning these 
values.

What numbers should I choose? I need 2 flags, and unsigned long only provides 
32 possibilities, which are already used (except 1 << 25).

For me it would be OK to just fix the cookies issue :) But I guess there is a 
reason why 1 << 25 is not used.

-Bernhard
On Friday, March 1, 2013 3:59:08 AM UTC+1, Boris Zbarsky wrote:
 On 2/28/13 9:37 PM, bernhardr...@gmail.com wrote:
 
  I have run hg pull and hg update on mozilla-central and still have no
  LOAD_DOCUMENT_URI.
 
 Are you looking in the right file?
 
  The problem seems to be that this patch breaks the WHOLE cookie handling
  of Firefox.
 
 Because your LOAD_NO_COOKIES has the same value as LOAD_DOCUMENT_URI,
 which means it's getting set for every web page load.
 
 Oh, and your LOAD_NOAUTH_HEADER has the same value as
 LOAD_RETARGETED_DOCUMENT_URI, which will lead to subtle bugs of its own,
 of course.
 
 You really can't introduce new flags to a bitfield that have the same
 values as existing flags. It just doesn't work well. ;)
 
 On a more serious note, I believe at this point all the flags except
 (1 << 25) are in use on HTTP channel, between nsIRequest, nsIChannel, and
 nsICachingChannel.
 
 -Boris



Re: LOAD_ANONYMOUS + LOAD_NOCOOKIES

2013-02-28 Thread bernhardredl
Just to keep this thread up to date: I asked jduell if it is possible to change 
long to int64_t.
On Friday, March 1, 2013 4:11:40 AM UTC+1, bernha...@gmail.com wrote:
 Yep, you are right. I assumed nsIRequest would be the only file assigning
 these values.
 
 What numbers should I choose? I need 2 flags, and unsigned long only
 provides 32 possibilities, which are already used (except 1 << 25).
 
 For me it would be OK to just fix the cookies issue :) But I guess there is
 a reason why 1 << 25 is not used.
 
 -Bernhard
 
 On Friday, March 1, 2013 3:59:08 AM UTC+1, Boris Zbarsky wrote:
  On 2/28/13 9:37 PM, bernhardr...@gmail.com wrote:
  
   I have run hg pull and hg update on mozilla-central and still have no
   LOAD_DOCUMENT_URI.
  
  Are you looking in the right file?
  
   The problem seems to be that this patch breaks the WHOLE cookie
   handling of Firefox.
  
  Because your LOAD_NO_COOKIES has the same value as LOAD_DOCUMENT_URI,
  which means it's getting set for every web page load.
  
  Oh, and your LOAD_NOAUTH_HEADER has the same value as
  LOAD_RETARGETED_DOCUMENT_URI, which will lead to subtle bugs of its own,
  of course.
  
  You really can't introduce new flags to a bitfield that have the same
  values as existing flags. It just doesn't work well. ;)
  
  On a more serious note, I believe at this point all the flags except
  (1 << 25) are in use on HTTP channel, between nsIRequest, nsIChannel, and
  nsICachingChannel.
  
  -Boris



Re: improving access to telemetry data (Help Wanted)

2013-02-28 Thread Taras Glek



Robert Kaiser wrote:

Taras Glek wrote:

 I doubt people actually want to build their own dashboards. I suspect this is
 mainly a need because of deficiencies in the current dashboard.

I disagree. I think people will want to integrate Telemetry data in
dashboards that connect data from different sources, and not just
Telemetry. That might be combinations with FHR data, with crash data, or
even other things.
I, for example, would love to have stability-related data from all those
sources trimmed down by a dashboard to digestible "this channel looks
good/bad" values.
You are correct. There is a valid use case for integrating subsets of 
telemetry data into wikis, other dashboards, etc.


Taras


Robert Kaiser



Re: LOAD_ANONYMOUS + LOAD_NOCOOKIES

2013-02-28 Thread Jason Duell

On 02/28/2013 07:29 PM, bernhardr...@gmail.com wrote:

Just to keep this thread up to date: I asked jduell if it is possible to change 
long to int64_t.


We're going to upgrade to 64 bits in

   https://bugzilla.mozilla.org/show_bug.cgi?id=846629

Jason
