WikiProject Chemicals (Rated NA-class)
This page is within the scope of WikiProject Chemicals, a daughter project of WikiProject Chemistry, which aims to improve Wikipedia's coverage of chemicals. To participate, help improve this page or visit the project page for details on the project.
 NA  This page does not require a rating on the project's quality scale.

Actual wikiproject info: statistics and alerts

Article alerts

Proposed deletions

Good article nominees

Articles to be merged

Articles to be split

Articles for creation

The worklist shows the actual work to be done to achieve the goals of the Chemicals wikiproject. The choice of important compound articles to work on was finalized at an earlier stage of the wikiproject (around mid-2005), and no further articles are being added, although we remain open to strong suggestions on this talk page. The work these days focuses on improving the articles, from Chem Stub all the way to Chem A-Class. The table below shows that progress.

Worklist historical status
                        2005        2006        2007        2008        2009
                         Jun   Oct   May   Oct   Mar   Oct   Feb   Aug   Apr   Dec
A-Class                   29    26    32    32    33    25    25    23    18    18
B-Class                   71    84   101   130   148   156   158   180   185   188
Start                    112   131   199   190   174   174   180   153   160   161
Stub                      97   130    46    29    27    27    19    26    19    18
Unclassified              76     -     -     -     -     -     -     -     -     -
Total                    385   371   378   381   382   382   382   382   382   382
≥ Chem Start, %         55.1  65.0  87.8  92.3  92.9  92.9  95.0  94.0  95.0  95.3
Weighted progress, %    42.2  50.4  57.8  60.8  62.2  61.7  62.4  63.1  63.2  63.9

The percentage ≥ Chem Start was indicative of the initial effort. Now that most articles have moved beyond that stage, the weighted progress indicator is used instead, calculated as (Unclass*0 + Stub*1 + Start*2 + B-Class*3 + A-Class*4) / (Articles*4).
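As an illustration, the two indicators can be recomputed from the class counts. The following is only a sketch: the function names are made up for this example, and the input is the June 2005 column of the table above.

```python
# Weights taken from the formula quoted above:
# Unclassified=0, Stub=1, Start=2, B-Class=3, A-Class=4.
WEIGHTS = {"Unclassified": 0, "Stub": 1, "Start": 2, "B-Class": 3, "A-Class": 4}


def weighted_progress(counts):
    """Weighted progress in percent (hypothetical helper name)."""
    total = sum(counts.values())
    score = sum(WEIGHTS[cls] * n for cls, n in counts.items())
    return 100.0 * score / (4 * total)


def at_least_start(counts):
    """Percentage of articles rated Start or better (hypothetical helper name)."""
    total = sum(counts.values())
    done = sum(n for cls, n in counts.items() if WEIGHTS[cls] >= WEIGHTS["Start"])
    return 100.0 * done / total


# June 2005 column of the worklist table.
jun_2005 = {"A-Class": 29, "B-Class": 71, "Start": 112, "Stub": 97, "Unclassified": 76}

print(round(weighted_progress(jun_2005), 1))  # 42.2
print(round(at_least_start(jun_2005), 1))     # 55.1
```

Both values agree with the June 2005 entries in the table. Note that the helpers divide by the sum of the class counts; for columns where the listed Total differs from that sum, the table's figures appear to use the listed Total.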

For the statistics for all chemicals, as registered by the bot, see also the complete list.

Images of samples: remove them all vs Assume good faith

Since many of us have lots of time at home, here is an interesting policy issue: check out this edit, made with the edit summary "picture of white crystals cannot be verified a being of a particular chemical compound". Well, perhaps all images of samples should be blanked out, since it is difficult or impossible to establish the identity of the sample shown. At the other extreme, we have AGF, i.e. we trust images. Somewhere between these extremes, some samples have distinctive morphologies, refractive indices, or colors that would give some clues. My vote is AGF, and assume that readers don't trust Wikipedia too much? (Confession: I have uploaded a lot of sample images.) --Smokefoot (talk) 01:44, 2 April 2020 (UTC)

Sample of Ethylenediaminetetraacetic acid disodium salt
I agree that the image (added here so readers don't need to check the EDTA page, where it has now been removed) can't be verified as any particular white solid, especially as in this case the vial isn't even labelled, which is terrible practice for a chemical! However, I don't think the issue is one of AGF, rather of "why does this article need any picture of the chemical?" The chembox for EDTA doesn't have a citation reference for any of the properties listed, including the "colourless crystals" property. I think I would prefer editors to focus on adding such references, rather than uploading unhelpful pictures. I think that most people know what "colourless crystals" mean without needing a picture. Michael D. Turnbull (talk) 13:35, 2 April 2020 (UTC)
Excellent question and one to which most experimental chemists can rapidly respond! White solids often look distinctive to us: some are milky white (sodium thiosulfate), some are glistening plates (anthracene), some are chunky (NaCl), etc. Of course all or most can be pulverized to appear similar. Beyond white/colorless solids/liquids, we are also discussing colored materials. Who is to say that our image of sodium dichromate is legit?--Smokefoot (talk) 13:50, 2 April 2020 (UTC)
Sodium chloride - crystals
I don't think that experimental chemists are our target audience for most articles. One file for NaCl as I've just added here from Commons would be very appropriate in an article commenting on mineral forms of rock salt but hopeless if discussing table salt. Sample images are too diverse in general to be useful without comment, and preferably verifiability. Thus I would deprecate the image of sodium dichromate not because it is incorrect (I recognise it as being close to what I have used in the past in the lab) but because it isn't properly referenced, has no label in the image and is "own work". In fact, none of the images of sodium dichromate on Commons meet my idea of ideal for an encyclopedia. Better perhaps to use pictures of compounds photographed alongside bottles and MSDS from legitimate lab suppliers, for example. Michael D. Turnbull (talk) 15:37, 2 April 2020 (UTC)
Incidentally, your image of anthracene is better because the bottle is at least labelled but it's hard to find by search in Commons because you didn't put a space between "anthracene" and "in bottle", so typing the word "anthracene" into the search box there doesn't give it as a hit. Michael D. Turnbull (talk) 15:59, 2 April 2020 (UTC)
Commons is often best-searched using categories (commons:Category:Anthracene has it), which avoids spelling (typos and different languages), spacing, punctuation, and shorthand variations. Links to them are often in the sidebar and/or in boxes towards the end of WP articles as well. We (speaking as a commons editor) would love help improving the specificity of some large-and-generic cats or adding cats to images that are not well-tagged at all...feel free to ask me for assistance. DMacks (talk) 02:38, 3 April 2020 (UTC)
Pictures certainly have use in certain cases. Colour changing hydrates like copper(II) chloride for instance. --Project Osprey (talk) 17:21, 2 April 2020 (UTC)
Yes, certainly, and the article on copper(II) chloride is a good use of images to clarify the text. Going back to the case of EDTA, which was the original example, another reason to dislike the suggested (now deleted) image is that it was of the sodium salt, which is not the topic of the article. If I were reading an article on acetic acid, I don't think I'd expect a picture of a sodium acetate sample in the Chembox. In fact, acetic acid uses a picture I approve of, with a clear label of "glacial acetic acid" and looking like an unopened bottle from a reputable supplier. Michael D. Turnbull (talk) 13:11, 3 April 2020 (UTC)

Let's not stray from the intended topic: are pictures of compounds useful even if their provenance cannot be established?

  • I say that the images of white and colorless compounds should be retained ordinarily, because such images can be informative because they often reveal distinctive features (example of anthracene). On the flip-side, these images do no harm.
  • I vote strongly for retaining images of colored species since the eye is so good at discerning colors.

--Smokefoot (talk) 15:55, 3 April 2020 (UTC)

I agree with Smokefoot on this: we should retain the images. They indicate other information too, such as a likely size of container for the substance. And it is better to have a picture than none at all. But we should use the best images, so certainly the ones that are of the substance in the article, and even better if we can be sure it really is the substance. I am also thinking that a much more magnified view of a few crystals would be useful as well. For a featured article we would want to be sure that the image was genuine and accurate. Own work does not prevent that. But photographing commercial packages of the product may be a copyright infringement of the label. Graeme Bartlett (talk) 04:13, 7 April 2020 (UTC)
While writing another article, I came across the pages for glutamine and glutamic acid. Take a look: they have sample pictures in the Chembox essentially identical to the one we were discussing. Even if they are genuine (for which we AGF) they are not particularly helpful. I don't think that copyright comes into play for a picture of a bottle + proper label, and many such images are already in use. Michael D. Turnbull (talk) 10:34, 22 April 2020 (UTC)
To me, the pictures of samples are helpful, so I hope that we can retain them. The samples of glutamine and glutamic acid look different, and they differ from the EDTA picture we discussed earlier. Here is another angle: how do the pictures hurt? Is Wikipedia running short of memory? They are consistent with the description of the compound. If we thought that the images misled, we would remove them, of course.--Smokefoot (talk) 10:52, 22 April 2020 (UTC)
I seem to remember this discussion having happened before, so when a conclusion has been reached perhaps you could update Wikipedia:Manual of Style/Chemistry? --Project Osprey (talk) 11:01, 22 April 2020 (UTC)
I couldn't find the earlier discussion when searching the archive but I'd like to make a proposal, thus -

The use of sample images within chemboxes is encouraged, especially where the sample illustrates a distinct color or crystal form. All images must follow the image use policy, be uploaded to Wikimedia Commons, and be of the chemical subject of the article. Pictures of chemical samples differ in quality: the best meet standards of verifiability and include a scale reference appropriate for the quantity of material shown. Examples of excellent, good and poor images are given in the gallery below.

Where multiple images of the same material exist, preference should be given to these quality considerations.

I suggest Smokefoot places this or a minor variant perhaps with more examples into Wikipedia:Manual of Style/Chemistry, if happy this answers the original question. Michael D. Turnbull (talk) 15:15, 26 April 2020 (UTC)
Minor tweak: "All images must follow the image use policy and be of the chemical subject of the article, and should be uploaded to Wikimedia Commons whenever licensing allows." That lists the three ideas in order of importance. My concern was that I don't think it should be required to be on Commons, and (wearing my admin hat) sometimes an image is not even allowed to be there (this is part of IUP) but is still a really good image for the article. DMacks (talk) 15:29, 26 April 2020 (UTC)
Regarding the first image (a "good" example), obviously the bottle has the amount that it contained when it was new, but we have no way of knowing that that whole "sample amount on bottle" is what is on the watchglass, and obviously we don't know the solution concentration (a value that can totally change the apparent visual color). As a further example, if I have a 1-kg container of carbon black, I think that accompanying it with a picture of 1 kg of the black powder would probably make the particle nature harder to see than a small pile would. DMacks (talk) 20:57, 26 April 2020 (UTC)
OK, I put the images into our MOS and added some slightly awkward wording. Edits are of course encouraged or welcome. Check out Wikipedia:Manual of Style/Chemistry#Sample images--Smokefoot (talk) 23:34, 26 April 2020 (UTC)
A label with only the chemical name is not necessary, since anyone can print one at home. I prefer AGF. Some information could be provided on the file page, such as chemical source, purity, weight of the sample, etc. --Leiem (talk) 02:28, 27 April 2020 (UTC)
Just for reference, Category:Chemical pages containing a local image tracks articles whose chembox has a local (non-Commons) image. DMacks (talk) 07:58, 30 April 2020 (UTC)

Pictures of chemicals are good, in my opinion. Part of the reason so many people hate chemistry is because they think it's just boring math and lines and symbols, not realizing that these are real chemicals that come in bottles. Even though almost everything in organic chemistry is either a colorless liquid or a white solid, they ought to have pictures still. MDaxo (talk) 08:58, 27 June 2020 (UTC)

Top million substances

Hello, I’m working with User:Egon Willighagen from Wikidata and others to compile a list of what we consider to be the one million most important chemicals. This list will be used to prioritize what we look at for both Wikidata and Wikipedia, and possibly other external groups that interact with us. These chemicals could include things like the elements and other basic substances you would encounter in your chemistry education, chemicals encountered in everyday life (e.g. in detergents, food additives or hair gel) as well as more niche substances such as pharmaceuticals, polymers, pollutants, biologically important materials, etc. Are there any specific collections of substances you would recommend us to look at? Please post any suggested lists or databases below. I've also posted on WT:Chemistry. Many thanks, Walkerma (talk) 18:17, 19 June 2020 (UTC)

@Walkerma: Be less subjective: take five of the biggest chemical databases (ChemIDplus, ChEBI, PubChem, ChEMBL, ChemDB, NIST, ECHA, ...), match their entries using a unique identifier like InChIKey or InChI, then take all chemicals present in all five databases, then all chemicals present in at least four, then all chemicals present in at least three, and check whether you have reached the million. If not, continue with all chemicals present in at least two databases; at that point I think you will reach one million or more. If the million is hard to hit (a big gap between a count below 1 million and a count above 1 million), then add more databases and restart the process.
The only limiting step is the data extraction from databases (database ID, name and InChIKey/InChI). Good luck Snipre (talk) 19:46, 19 June 2020 (UTC)
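The tiered selection described in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the database names and the short placeholder keys are invented for the example (real input would be sets of InChIKeys extracted from each database), and whole tiers are added at a time, so the result may overshoot the target.

```python
from collections import Counter


def select_by_consensus(databases, target):
    """Pick identifiers tier by tier: those present in all databases first,
    then those in one fewer, and so on down to two databases, stopping once
    at least `target` identifiers are selected (hypothetical helper)."""
    # Count in how many databases each identifier occurs.
    occurrences = Counter()
    for keys in databases.values():
        occurrences.update(set(keys))

    selected = []
    for k in range(len(databases), 1, -1):  # n, n-1, ..., 2
        tier = sorted(key for key, n in occurrences.items() if n == k)
        selected.extend(tier)
        if len(selected) >= target:
            break
    return selected


# Toy data: placeholder strings stand in for real InChIKeys.
toy_dbs = {
    "db_one": {"KEY-A", "KEY-B", "KEY-C", "KEY-D"},
    "db_two": {"KEY-A", "KEY-B", "KEY-C"},
    "db_three": {"KEY-A", "KEY-B", "KEY-E"},
}

print(select_by_consensus(toy_dbs, target=3))
# ['KEY-A', 'KEY-B', 'KEY-C']
```

Identifiers found in only one database (KEY-D and KEY-E here) are never selected, which matches the "at least 2 DB" floor in the description. The sketch does nothing about entries copied between databases, the concern raised in the reply below.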
What you describe is pretty much the approach we've taken thus far, but we wanted to take a deep dive into substance selection. There can be dangers in a simpler, automated approach, in that online databases often copy entries from one another; in the early days, for example, many erroneous entries were shared (in both directions) between ChemSpider and PubChem. This is really a "manual check" that we're not overlooking some specialist area. An analogy - if you looked at most popular movies on Amazon, Hulu, etc. you might overlook the most important movie ever made in Moldova, and you might find there is a bias towards more recent movies. I'm sure the big databases have their emphases and biases. I certainly accept your point that "most important" is going to have some subjectivity to it, but we'd rather know (about a significant database or list) than not know. Regarding the extraction, some of the Wikidata people (like Egon) are experts at data mining and validation, as I'm sure a lot of that will need to be done! Thanks, Walkerma (talk) 21:23, 19 June 2020 (UTC)
If you are looking for things in "everyday life", make sure you include anything with an E number (about 1600 substances) or in the Merck Index, the Sigma-Aldrich catalogue or the comprehensive database of pesticide common names curated at "Index of (pesticide) Inchi keys". All these should be available to you. Another useful database (140,000 substances) should be the REACH database at "Search for chemicals", since manufacturers have in effect paid to have their compounds and data added there. I agree that it isn't a good idea simply to assume that inclusion in multiple big databases is a sign of importance: since the advent of combichem in the mid 1990s many databases became swamped with, in effect, adverts for substances that were available but pretty worthless and unlikely to make an impact on anyone. Michael D. Turnbull (talk) 12:28, 20 June 2020 (UTC)
For me, one million is not useful. The vast majority are un-notable. Core is about 1000, and then 10,000 chemicals. That's the list that interests me. We probably have the top 1000 already. Here are some additional thoughts:
  • where are we now? How many chemicals do we have presently?
  • complication: many of our chemical pages include multiple hydrates (inorganics) and multiple stereoisomers (R vs S as well as meso), so the list of chemical articles is an under-count. (BTW: I have been pretty careful about listing various hydrates, but in general we are weak on depicting and listing both enantiomers as well as racemates.) It would also be nice to have a list of all chiral compounds/ions in Wikipedia, and then make a to-do list to be sure that we have both stereoisomers plus the racemate for these.
  • The ideal, IMHO, approach would be to extract the 10,000 most cited registry numbers from CAS. --Smokefoot (talk) 14:14, 20 June 2020 (UTC)
The idea of having the most-cited CAS numbers is a good one, if CAS would give you the data. I disagree about wanting "to be sure we have both stereoisomers plus racemate", as the vast majority of chiral products in common use are natural products where the enantiomer and the racemate are much less notable, if they have been documented at all. What are you proposing to do with the million? For sure, they won't all merit Wikipedia articles. Michael D. Turnbull (talk) 14:53, 20 June 2020 (UTC)
The million thing is Martin's idea. Diluting Wiki-Chem with lots of non-notable entries complicates searches and floods categories, i.e. damages the functionality of the service. So one step would be to reach a consensus on our target number. The problem is that it is virtually impossible to remove articles on trivia once they appear here. So many nonchemists with vehement views show up when it's time to discuss deleting an article. I imagine that CAS sees Wikipedia as a competitor, thus they would be unwilling to provide their top x,000 most cited articles.--Smokefoot (talk) 16:11, 20 June 2020 (UTC)
The million number is certainly not for Wikipedia, but it may be a useful target for Wikidata, who are also involved with this project. (Although I consider myself an inclusionist, I'm with Smokefoot on this point.) With its present scope, I can't imagine Wikipedia ever growing to more than about 30,000 articles (we currently have around 20,000 articles in WP:Chemicals). Of course, as Smokefoot points out, one article might cover multiple hydrates or stereoisomers, e.g., tartaric acid, so those 20,000 articles probably represent 30-50k actual substances. But the point of the million is that although I'm soliciting help on Wikipedia, and it may impact Wikipedia a little, most of the impact will be on Wikidata, and on groups outside the Mediawiki community.
As for talking to CAS, I will certainly do that. Back in 2009, they did willingly share their "top 7,800 substances" and created CAS Common Chemistry as a result of their collaboration with us. (We were told these were the most accessed nos., but we don't know the details of exactly how they picked those substances). If you click inside Wikipedia on any CAS no. on that list, it takes you to the CAS Common Chemistry page for that substance; this was our collaborative way to validate our CAS numbers inside Wikipedia - "If you don't believe this is correct, click on the link and see what CAS says the CAS no. is!" The problem is that even CAS accepts that the list is outdated. So it would be good to talk with them; we have already spoken with people at PubChem, and they are involved too. Walkerma (talk) 17:34, 20 June 2020 (UTC)
@Walkerma: One could check all articles having {{Chembox}} (11324 P) and having {{Infobox drug}} (7044 P). These infoboxes can have multiple compounds (see Category:Chemical articles with multiple compound IDs (1,768)). -DePiep (talk) 11:18, 21 June 2020 (UTC)
Good idea. We should certainly crosscheck those lists with the 20,000 I linked above; I'm sure we'll find plenty that don't have infoboxes, and also a few that do have an infobox but no WikiProject tag. I didn't know about that multiple compound ID category! Walkerma (talk) 20:53, 21 June 2020 (UTC)
Without wanting to be drawn into a discussion about how we define 'importance', can I ask what the goal is here? If a million articles is considered unrealistic for wikipedia (and I'd agree with that) what useful output would all this data have for wikidata? --Project Osprey (talk) 18:15, 22 June 2020 (UTC)
@Project Osprey:: Wikidata is much more than just a service for Wikipedia - they already probably have at least 10x the number of substances represented than the English Wikipedia does. I think there is a huge demand for basic substance information that is machine-readable and isn't behind a paywall, and Wikidata is helping to meet that need (along with PubChem, ChemSpider, etc.) This project is partly to help guide the expansion of Wikidata, but also our interaction with outside groups & organizations. Walkerma (talk) 17:58, 24 June 2020 (UTC)
Fair enough. Probably wise to focus on obviously important compounds as chemical space is startlingly large: limiting yourself to all molecules of up to 11 atoms of C, N, O, and F (further limiting by simple valency, chemical stability etc) gives you 26 million molecules (110 million stereoisomers). Needless to say the upper limits, should any exist, are very high indeed. Data on most polymers is essentially meaningless, polypropylene for instance has a single CAS number but its forms are legion. --Project Osprey (talk) 21:18, 24 June 2020 (UTC)
Some categories of compounds that appear to be deficient in these databases are alloys, minerals, silanes, and boranes. I am assuming that CAS lists everything, but there could be whole classes of journals they miss out on indexing. Graeme Bartlett (talk) 06:15, 23 June 2020 (UTC)
@Graeme Bartlett:: Yes, I can definitely believe that. Are there any specialist databases you know of that could help us pick up more of those areas? Should we look at this list or is there something better? Walkerma (talk) 18:02, 24 June 2020 (UTC)
I don't know about databases, just when I try to look silane or borane compounds up I find them missing or in error. For minerals, the mindat web site is fairly complete, and so are the Wikipedia lists. But as a chemical I suppose we are mostly interested in the end-members or pure products. At least with minerals they distinguish between different hydrates and crystal forms. Graeme Bartlett (talk) 01:03, 25 June 2020 (UTC)
I think we can handle things like mixtures, minerals, etc. In a top 100 list, you'd expect only pure substances, but in the top million you should see minerals and alloys. Not sure how many silanes and boranes should appear, but I'll be sure to do a manual check that at least the most important ones are included. Mindat will be useful because it looks to be largely independent of the chemistry community, with the WP lists to guide us too. Thanks, Walkerma (talk) 03:26, 26 June 2020 (UTC)
Yet another class that is spottily covered in these databases is polymers. Graeme Bartlett (talk) 09:33, 29 June 2020 (UTC)

@Walkerma: WD already has a million chemicals, see here, but around 10 to 20% are mixtures. Snipre (talk) 22:04, 27 June 2020 (UTC)

And that list is without the 118 chemical elements, mind us ;-) ;-) -DePiep (talk) 00:20, 28 June 2020 (UTC)

Why 1M?

Actually, I wonder what the number of "1 million" stems from, and why there is no detailing. For example, why are the first #1–100.000 exactly equally relevant as those #900.001–1.000.000? Is there no ordering at all per (say) 100.000 chemicals? Once one gathers those ID's, there could also be some ordering data with it. -DePiep (talk) 00:17, 28 June 2020 (UTC)

They're not equally relevant - water is clearly more important than number 1000 on the list, which is more important than no. 100,000. It's a steady dropoff in importance as you work down the list. One million is a workable number to begin with, and one which works with the outside groups we've been talking with. For Wikidata, it could help us find where the gaps are, because I'm sure there will be some fairly important compounds missing. Thanks for sharing the information from Wikidata; that's very helpful! Walkerma (talk) 05:50, 29 June 2020 (UTC)

Catal's reagent#Preparation

What do you think about this cookbook-like section? --Leyo 19:30, 28 June 2020 (UTC) BTW: The chembox is missing, too.

Inappropriate. A suitable statement would be the chemical equation and words identifying the reagents and any special conditions. DMacks (talk) 23:15, 28 June 2020 (UTC)

The whole article seems to be COI. Mr Catal has very modestly named the reagent after himself, the 1st and only paper on it is currently at the pre-proof stage doi:10.1016/j.saa.2020.118631 and there's nothing to suggest that anyone other than this 1 group is using the stuff. It doesn't meet our usual standards. --Project Osprey (talk) 09:14, 29 June 2020 (UTC)

Agreed. Should now be AfD. A single reference that the referees may well not accept and certainly no secondary/tertiary sources. It's unclear to me why the article has a "See also" reference to Woollins' reagent as that is chemically completely unrelated but at least has been around for some time. Michael D. Turnbull (talk) 10:17, 29 June 2020 (UTC)
I have listed this here: Wikipedia:Articles for deletion/Catal's reagent. Please add your opinions. Graeme Bartlett (talk) 12:46, 29 June 2020 (UTC)

Please review my sources of some chemicals

Please review them and post on my talk page whether I can add them or not. Since it helps to increase the data on Wikipedia, I could invoke Ignore_all_rules, right?


@Machinexa: you should sign your request, especially if you want people to use your talk page! Graeme Bartlett (talk) 07:41, 2 July 2020 (UTC)

Machinexa (talk) 12:42, 2 July 2020 (UTC)