Look at the record_version field to know if the new column is populated.
https://wikitech.wikimedia.org/wiki/Analytics/Data/Webrequest#Changes_and_known_problems_since_2015-03-04
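A minimal sketch of that check, written in Python for illustration only. The dotted version format and the cutoff MIN_VERSION_WITH_PAGEVIEW are assumptions, not values taken from the wiki page; the real ones are listed at the link above.

```python
from typing import Optional, Tuple

# Hypothetical cutoff: first record_version in which is_pageview is populated.
# The real value is documented on the wiki page linked above.
MIN_VERSION_WITH_PAGEVIEW = (0, 0, 2)


def parse_version(record_version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '0.0.2' into a comparable tuple."""
    return tuple(int(part) for part in record_version.split("."))


def pageview_or_none(row: dict) -> Optional[bool]:
    """Return is_pageview only for rows new enough to have it populated."""
    if parse_version(row["record_version"]) >= MIN_VERSION_WITH_PAGEVIEW:
        return row.get("is_pageview")
    return None
```

The same idea would carry over to a query as a conditional on record_version, so older rows yield NULL instead of a misleading default.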
On Sun, Apr 12, 2015 at 10:43 AM, Toby Negrin wrote:
Hi Yuri --
In general, I do not think this table will change a lot moving forward.
We're migrating to a more complete definition right now so some changes are
to be expected but things should settle down.
Thanks for the new fields!
-Toby
On Sun, Apr 12, 2015 at 9:55 AM, Andrew Otto wrote:
You probably have to do it conditionally by date
On Apr 12, 2015, at 12:38, Yuri Astrakhan wrote:
Thanks Oliver! Is there a way to handle it in hql? E.g if(
exists(is_pageview),is_pageview,null)? Finding out if field exists by
observing query crash seems wrong ))
On Apr 12, 2015 06:53, "Oliver Keyes" wrote:
(Duplicated from bug):
That's not a bug. The complexity of regenerating ~60 days of data,
where a day is 24*60*125000 rows, is extreme, and adding new fields
means doing just that - regenerating the entire thing. As such, the
decision was made to add to the field definition and only add actual
val
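To put the figures quoted in this message in perspective, here is the arithmetic worked out as a small Python sketch. It only multiplies the numbers given above and treats "~60 days" as exactly 60.

```python
ROWS_PER_DAY = 24 * 60 * 125_000  # figure quoted in the message
DAYS = 60                         # "~60 days", taken as exactly 60 here

total_rows = ROWS_PER_DAY * DAYS

print(f"{ROWS_PER_DAY:,} rows per day")
print(f"about {total_rows:,} rows to regenerate")
```

That comes to 180,000,000 rows per day and roughly 10.8 billion rows for the whole window.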
I tried to move Zero analytics to the new table, and decided to test the
new wonderful fields like agent_type ... and it only works on the most
recent hours of data ((
https://phabricator.wikimedia.org/T95806
On Fri, Apr 10, 2015 at 8:51 PM, Yuri Astrakhan wrote:
Please clarify why the field "is_zero" is needed, as it is nothing more
than a test for ("zero=" in x_analytics). Does having this field
significantly improve performance for zero queries, e.g. "select count(*)
from requests where is_zero = true"? Because otherwise it simply identifies
"zero partne
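The test described in this message can be sketched as a one-line substring check, shown here in Python. The sample header values are made up for illustration, and whether the real is_zero field is computed exactly this way is not confirmed in the thread.

```python
def is_zero(x_analytics: str) -> bool:
    """True when the X-Analytics value carries a 'zero=' key."""
    return "zero=" in x_analytics


# Hypothetical sample values, not taken from real requests.
samples = ["zero=123-45;https=1", "https=1", ""]
flags = [is_zero(value) for value in samples]
```

A precomputed boolean column would let a query filter on the flag directly rather than running a string search over every row.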
Cool!
On 10 April 2015 at 17:12, Joseph Allemandou wrote:
Yes Oliver, the agent_type = spider includes IsCrawler UDF.
On Fri, Apr 10, 2015 at 11:08 PM, Oliver Keyes wrote:
What does agent_type add? In the sense that if we're pre-parsing the
user agent, surely the difference is between "WHERE agent_type !=
'spider'" and "WHERE user_agent_map['device_family'] != 'Spider'"?
Does agent_type include the isCrawler UDF results?
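A small Python sketch of why the two filters in this question can differ, given Joseph's reply that agent_type = spider also includes the IsCrawler UDF. The sample rows and their values are hypothetical, invented purely to illustrate the point.

```python
# Hypothetical sample rows; the field values are made up for illustration.
rows = [
    {"agent_type": "user", "user_agent_map": {"device_family": "iPhone"}},
    {"agent_type": "spider", "user_agent_map": {"device_family": "Spider"}},
    # Assumed case: flagged only by the crawler check, not by the device family.
    {"agent_type": "spider", "user_agent_map": {"device_family": "Other"}},
]

# Filter 1: WHERE agent_type != 'spider'
by_agent_type = [r for r in rows if r["agent_type"] != "spider"]

# Filter 2: WHERE user_agent_map['device_family'] != 'Spider'
by_device_family = [
    r for r in rows if r["user_agent_map"]["device_family"] != "Spider"
]
```

In this made-up data the first filter keeps one row and the second keeps two, because the third row is caught only by the broader agent_type classification.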
On 10 April 2015 at 16:47, Joseph Allemandou wrote:
And I forgot one field:
- is_zero - True if a request is made on a zero provider.
On Fri, Apr 10, 2015 at 10:36 PM, Leila Zia wrote:
Hi Joseph,
Thanks for the update, and for doing this. These three items make the
analysis of the data much easier on our end. We've had many requests in the
past that required agent_type and access_method information and having them
readily available is awesome! :-)
Have a great weekend!
Leil
Hi Analytics people,
Today brings another batch of additions to the refined webrequest table in
Hive.
Now the table contains:
- ts - The Unix timestamp (in milliseconds) version of the dt date
- access_method - The method used to access the site, being one of the
three [mobile app | mobile
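As an illustration of the ts field listed above, here is a minimal Python sketch that converts a dt value to a Unix timestamp in milliseconds. It assumes dt is an ISO 8601 string without a zone suffix and that it is expressed in UTC; neither detail is stated in this message.

```python
from datetime import datetime, timezone


def dt_to_ts_ms(dt: str) -> int:
    """Convert a dt string (assumed format and UTC) to Unix milliseconds."""
    parsed = datetime.strptime(dt, "%Y-%m-%dT%H:%M:%S")
    # Assumption: dt is in UTC, so attach the zone before converting.
    aware = parsed.replace(tzinfo=timezone.utc)
    return int(aware.timestamp() * 1000)


ts = dt_to_ts_ms("2015-01-01T00:00:00")
```

Having ts precomputed would save queries from repeating this kind of parsing on every row.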