[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-10 Thread Manuel
Manuel closed this task as "Resolved".
Manuel added a comment.


  Thank you again, @dcausse, for all of your support! :)

TASK DETAIL
  https://phabricator.wikimedia.org/T342123

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE, Manuel
Cc: dcausse, Lydia_Pintscher, dr0ptp4kt, Aklapper, Manuel, 
Danny_Benjafield_WMDE, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, 
ItamarWMDE, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  @Manuel, @dcausse: the metrics increased, but only very marginally; we're now 
over 50%. Let me know if anything else is needed!



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Thanks a lot for this, @dcausse! The reasoning that a single column with 
relatively few rows is a good candidate for caching makes a lot of sense. I 
think the problems I faced came from trying to cache `df_wikidata_rdf`. I just 
ran things again with only `sa_and_sasc_ids` cached, and it did seem to run a 
bit better. That said, I did end up running the notebook multiple times and 
saving the outputs to variables as I went along before restarting the kernel.
  
  Will update the task with the final values now!



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-10 Thread dcausse
dcausse added a comment.


  In T342123#9081490, @AndrewTavis_WMDE wrote:
  
  > Minor question on this, @dcausse: why aren't we caching `df_wikidata_rdf` 
and `sa_and_sasc_ids` above? My assumption is that we should, given that we're 
using them in multiple later calculations, but when I tried to cache them, a 
calculation that would normally finish lost resources and stalled with three 
separate stages running. Did you explicitly choose not to cache them, and if 
so, why not? :)
  
  I don't remember having such problems, nor thinking too much about what to 
cache. Generally speaking, caching comes with an extra cost and it's not always 
obvious that you'll get a net benefit, but here I tend to agree that 
`sa_and_sasc_ids` sounds like a good candidate for caching (single column, 
relatively few rows), and I'm not sure I understand why it would fail... have 
you tried multiple times? It might be unrelated to caching. If your notebook 
has had its kernel open for a long time (several days) and the Spark session 
was still open during that time, I would not be surprised if Hadoop had tried 
to clean up some things in the meantime, making Spark unhappy... just making 
random guesses here. If it still does not work after retrying on a fresh Spark 
session (by killing your kernel), please feel free to upload your code 
somewhere and I'll give it a try.
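  
  As a rough illustration of the caching advice above, here is a minimal sketch 
that caches only a small, reused DataFrame while it is needed and releases it 
afterwards. The helper name `cached` is hypothetical and not from the notebook; 
it assumes the object passed in offers `cache()`, `count()` and `unpersist()`, 
as a PySpark DataFrame does.

```python
from contextlib import contextmanager


@contextmanager
def cached(df):
    """Keep a small, reused DataFrame cached only while it is needed.

    `cached` is a hypothetical helper, not part of the notebook in this task.
    `df` is assumed to behave like a PySpark DataFrame.
    """
    cached_df = df.cache()  # lazy: only marks the DataFrame for caching
    try:
        cached_df.count()  # an action, which forces the cache to be filled
        yield cached_df
    finally:
        cached_df.unpersist()  # release the cached data when done


# Intended use in a notebook (DataFrame names taken from the thread):
#
# with cached(sa_and_sasc_ids) as ids:
#     cond = ids["sa_and_sasc_qids"] == df_wikidata_rdf["context"]
#     n_inner = df_wikidata_rdf.join(ids, on=cond, how="inner").count()
#     n_anti = df_wikidata_rdf.join(ids, on=cond, how="leftanti").count()
```

  Actions such as `count()` would need to run inside the `with` block, since 
the cached data is released as soon as the block exits.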



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-09 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Minor question on this, @dcausse: why aren't we caching `df_wikidata_rdf` and 
`sa_and_sasc_ids` above? My assumption is that we should, given that we're 
using them in multiple later calculations, but when I tried to cache them, a 
calculation that would normally finish lost resources and stalled with three 
separate stages running. Did you explicitly choose not to cache them, and if 
so, why not? :)



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-09 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-09 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  That's what we were thinking too, @dcausse :) I'm realizing that where I had 
the `.distinct()` was incorrect, though. I think it should go at the end of the 
full definition of the PySpark df, like this:
  
    # Got rid of the sa_and_... because it was getting too verbose
    df_sasc_ids = (
        df_wikidata_rdf.select(col("subject").alias("distinct_sasc_qids"))
        .where(col("predicate") == P31_DIRECT_URL)
        .where(col("object").isin(sasc_qids))
        .alias("df_sasc_ids")
    ).distinct()
  
  Thanks for checking in!



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-09 Thread dcausse
dcausse added a comment.


  At a glance, I suspect that you might now get duplicated QIDs in
  
    sa_and_sasc_ids = (
        df_wikidata_rdf.select(col("subject").alias("sa_and_sasc_qids"))
        .where(col("predicate") == P31_DIRECT_URL)
        .where(col("object").isin(sa_and_sasc_qids))
        .alias("sa_and_sasc_ids")
    )
  
  which could be explained by entities being tagged with multiple entries found 
in `sa_and_sasc_qids`.
  What happens if you apply a `distinct` here:
  
    sa_and_sasc_ids = (
        df_wikidata_rdf.select(col("subject").alias("sa_and_sasc_qids"))
        .where(col("predicate") == P31_DIRECT_URL)
        .where(col("object").isin(sa_and_sasc_qids))
        .distinct()
        .alias("sa_and_sasc_ids")
    )
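  
  To make the suspected effect concrete, the following is a small plain-Python 
model of the two joins, not the actual Spark job. All triples are made up for 
illustration, `Q_SUBCLASS` is a placeholder rather than a real QID, and the 
helper functions are hypothetical. It shows how an entity tagged with two 
matching classes would be counted twice by the inner join while the left-anti 
join is unaffected, and how deduplicating the keys restores the total.

```python
from collections import Counter

# Toy triples as (context, predicate, object). All values are made up for
# illustration; "Q_SUBCLASS" is a placeholder, not a real QID.
triples = [
    ("Q1", "P31", "Q13442814"),
    ("Q1", "P31", "Q_SUBCLASS"),  # Q1 is tagged with two matching classes
    ("Q1", "P50", "Q100"),
    ("Q2", "P31", "Q5"),
    ("Q2", "P21", "Q200"),
]
class_qids = {"Q13442814", "Q_SUBCLASS"}

# Mirrors the filter in sa_and_sasc_ids: one row per matching P31 triple.
ids = [ctx for (ctx, pred, obj) in triples if pred == "P31" and obj in class_qids]


def inner_join_count(rows, keys):
    """Row count of an inner join: each row repeats once per matching key."""
    key_counts = Counter(keys)
    return sum(key_counts[ctx] for (ctx, _, _) in rows)


def left_anti_count(rows, keys):
    """Row count of a left-anti join: rows whose context matches no key."""
    key_set = set(keys)
    return sum(1 for (ctx, _, _) in rows if ctx not in key_set)


print(len(triples))                             # 5 triples in total
print(inner_join_count(triples, ids))           # 6: Q1's 3 rows counted twice
print(left_anti_count(triples, ids))            # 2
print(inner_join_count(triples, set(ids)))      # 3 once the keys are distinct
```

  With duplicated keys the two counts sum to 8 rather than 5; with distinct 
keys they sum to 5 again.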



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-08 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  @dcausse, do you have an idea why the direct triples for SAs and their 
subclasses and the direct triples for non-SAs and subclasses don't add up to 
the total number of triples? It was working out for the last notebook, as you 
saw. The only major change I've made is that it's now 
`.where(col("object").isin(sa_and_sasc_qids))` rather than the equality, where 
`sa_and_sasc_qids` is the hard-coded QIDs from above, including scholarly 
article's (I was getting some papers back when directly querying subclasses).
  
  The important snippets from the code:
  
    df_wikidata_rdf = (
        spark.table("discovery.wikibase_rdf")
        .where("wiki='wikidata' AND date = '20230717'")
        .alias("df_wikidata_rdf")
    )

    sa_and_sasc_ids = (
        df_wikidata_rdf.select(col("subject").alias("sa_and_sasc_qids"))
        .where(col("predicate") == P31_DIRECT_URL)
        .where(col("object").isin(sa_and_sasc_qids))
        .alias("sa_and_sasc_ids")
    )

    sa_and_sasc_direct_triples = (
        df_wikidata_rdf.join(
            other=sa_and_sasc_ids,
            on=(sa_and_sasc_ids["sa_and_sasc_qids"] == df_wikidata_rdf["context"]),
            how="inner",
        )
        .select("df_wikidata_rdf.*")
        .cache()
    )

    non_sa_and_sasc_direct_triples = (
        df_wikidata_rdf.join(
            other=sa_and_sasc_ids,
            on=(sa_and_sasc_ids["sa_and_sasc_qids"] == df_wikidata_rdf["context"]),
            how="leftanti",
        )
        .select("df_wikidata_rdf.*")
        .cache()
    )

    print_num_str_with_commas(total_triples)
    # 15,043,483,216

    print_num_str_with_commas(sa_and_sasc_direct_triples.count())
    # 7,778,494,249

    print_num_str_with_commas(non_sa_and_sasc_direct_triples.count())
    # 7,847,030,088

    print_num_str_with_commas(
        total_sa_and_sasc_direct_triples + total_non_sa_and_sasc_direct_triples
    )
    # 15,625,524,337
  
  Is there something going on with the relationship between the multiple 
classes? Do we need to switch the joins up for this one?
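  
  For reference, the size of the mismatch can be worked out directly from the 
counts printed above. This is only the arithmetic, with hypothetical variable 
names. When the key side of both joins holds one row per entity, an inner join 
and a left-anti join on the same key split the triples into two disjoint sets, 
so their counts should sum to the total; a positive gap would then point to 
rows being counted more than once.

```python
# Counts copied from the notebook output quoted in this comment.
total_triples = 15_043_483_216
sa_and_sasc = 7_778_494_249
non_sa_and_sasc = 7_847_030_088

combined = sa_and_sasc + non_sa_and_sasc
overshoot = combined - total_triples

print(f"{combined:,}")   # 15,625,524,337
print(f"{overshoot:,}")  # 582,041,121
```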



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-08 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-08 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-08 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-08 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-08 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-08 Thread Manuel
Manuel updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-08 Thread Manuel
Manuel updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-08 Thread Manuel
Manuel added a comment.


  Hi AndrewTavis_WMDE, thank you, this helped a lot!
  
  Comparing the two lists, we can see that AKhatun's classes are not solely 
subclasses of scholarly articles. This means that we do not need to look at 
them for this task. I have edited the task accordingly.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-08 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Looking at this further, it seems that AKhatun focussed more on scholarly 
articles and was just listing subclasses in the report itself as examples. 
Reference for this is this part of the report:
  
  > Scholarly articles have the largest count (37M) while everything else 
combined is in the thousands (~130K, excluding those that are included in 
scholarly articles), therefore the analysis is more focused on scholarly 
articles than others.
  
  Of the ones that were listed, scientific journal (Q5633421), scholarly 
conference abstract (Q58632367) and conference paper (Q23927052) are not direct 
subclasses of scholarly article, so maybe what we can focus on is the direct 
subclasses of scholarly article and the three that are not included as well? 
I'd say that scientific journals would be needed for the new graph as well 🤔



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-08 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a subscriber: dcausse.
AndrewTavis_WMDE added a comment.


  @Manuel, @dcausse: I have the classes from AKhatun and the subclasses of 
scholarly article listed in the task now. I figured it'd be good to get them 
all here so we know what we're talking about :)



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-08 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-08-08 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-07-28 Thread Manuel
Manuel added a subscriber: Lydia_Pintscher.
Manuel added a comment.


  @Lydia_Pintscher: Instead of using the subclasses that Aisha Khatun used in 
her research, we could also use suggestions from Scholia. Do we have any input 
from them yet?



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-07-27 Thread Manuel
Manuel updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-07-24 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-07-24 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-07-24 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-07-24 Thread Manuel
Manuel updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-07-24 Thread Manuel
Manuel updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-07-24 Thread Manuel
Manuel added a comment.


  Sounds good!



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-07-24 Thread Manuel
Manuel updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-07-24 Thread Manuel
Manuel updated the task description.



[Wikidata-bugs] [Maniphest] T342123: [Analytics] Find out the size of the Q13442814 (scholarly article) subgraph (including instances of subclasses)

2023-07-24 Thread Manuel
Manuel renamed this task from "[Analytics] Find out the size of the Q13442814 
(scholarly article) subgraph (including all instances of subclasses)" to 
"[Analytics] Find out the size of the Q13442814 (scholarly article) subgraph 
(including instances of subclasses)".
Manuel updated the task description.
