What are we building?

Aim

Our aim is to develop an automated way to quantify the net impact of companies on people, planet, society and knowledge.

Approach

We are building something many think is impossible. It's ok.

Our approach for building our quantification model is:

Iterative

The model will never be ready. Rather we build a new version practically every week. New products, research and companies emerge every day, and that is pretty cool. We are happy that already now, in our project’s infant phase, we are able to produce information and understanding that to our knowledge no other system can. However, the road ahead of us keeps us humble and excited.

Fiercely practical

The model will never be perfect. All models and numerical representations are just proxies; they are NOT the truth. A perfect model accurately depicting all impacts of all companies in real time is theoretically impossible. However, a system for measuring net value creation that is significantly better than the one in use today CAN be built. Not being able to build something perfect is a lousy excuse for not doing better.

Collaborative

We want to bring together the work of millions of researchers by using scientific articles as our main input data. We also want to build a system where everyone who wants to contribute to a better understanding of what companies achieve can do so. Our aim is to help people collaborate in making more fact-based decisions.


“Big stuff only”

This is our dearest design principle throughout the model. It means we only concentrate on the largest impacts a company has on the surrounding world. It’s the only way to stop losing sight of the Tier 1 priorities (e.g. fighting climate change) in Tier 2-10 data (e.g. whether or not your company serves cool artisanal coffee). Also, it’s a great way to stay sane. Practical example: for an oil company, we don’t care whether or not they use recycled office paper.

Uses

What is our model good at?

Our net impact model is built for…

Analyzing and optimizing large groups of companies (e.g. investment portfolios).

  • Our model is good at looking at the big picture and summarizing information that would otherwise be very difficult or impossible for human brains to grasp.
  • It helps answer questions such as “Which funds actually fight climate change in the most effective way?” or “What should I invest in if I want to maximize my impact on creating new knowledge while, at the same time, keeping my environmental impact net positive?”

Making comparisons between products or companies.

  • It helps answer questions such as “If I want to work in marketing and contribute the most to health of people while keeping my GHG emissions low, which of these companies ranks the highest for me?”

Understanding the scale of impacts.

  • Our model helps understand what is big and what is not. It also forces the user to be explicit about their values: decide what they are optimizing instead of just talking about "good" or "bad" business.

Our net impact model is not built for…

Dividing companies into good and bad ones.

  • We don’t believe in certificates for morally superior companies as an effective tool for driving real change. We believe in calmly looking at facts without jumping to conclusions: what does this company get done and with which resources? With our net impact model, we aim to raise discussion about what companies really achieve, and to facilitate a step change in how the value creation of companies is understood.

Comparing two brands of the same product with largely similar impact.

  • Our model is not meant for answering questions like “Should I buy my soda from Pepsi or Coca Cola?” If their product mix, employee count and size are similar, they get a similar net impact score.
  • This boils down to the granularity of our product taxonomy: if two products have significantly differing impacts (e.g. sugar-sweetened soft drink vs. artificially sweetened soft drink), they are listed as different products in our taxonomy. If not, then they are the same.

Logic

The Upright model aims to build a big picture of what kind of value companies create. Our approach can be simplified to three inputs: an understanding of what kinds of products and services exist, a list of all the ways a company can impact the world around it, and scientific journal articles as source data for understanding the relationships between the two.

Basic logic of our quantification model

1) Taxonomy of all products and services
2) Structure of main impacts of companies
3) Database of 80 million scientific articles

→ Magic AI → Automated summary of impact of all products and services

How can we gain understanding of all products and services from scientific articles, while there are hardly any studying the impacts of, let’s say, pencils? The answer lies in how our taxonomy is built: the products form a network with links to one another.

How is the Upright product taxonomy built?

Rather than being individual items, products form a network. Products are linked to each other in two ways:

1) According to product family hierarchies (e.g. “apple” is a child of “fruits” and a parent of “green apple”)
2) According to value chain parts (e.g. “apple farmer” buys pesticides from “pesticide company” and sells its apples to “fruit wholesaler”)

Below you can see an example about product family hierarchies for an apple:

How does the Upright product taxonomy work?

Example: apple

Parent: Plant-based products
  Parent: Fruits
    Parent: Pome fruits and stone fruits
      Product: Apple
        Child: Green apple
        Child: Red apple
        Child: Yellow apple

Example of logic: If there is scientific knowledge that consumption of “plant-based products” contributes positively to the treatment of type 2 diabetes, “green apple” inherits that positive impact.
A product inherits impacts from 1) all its children and 2) the parents on its particular path.
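
The inheritance rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries `PARENT` and `DIRECT`, the function names and the impact strings are hypothetical, and the sketch ignores the relevance weighting that the real model applies to each link.

```python
from collections import defaultdict

# Hypothetical toy taxonomy: child -> parent (names from the apple example).
PARENT = {
    "green apple": "apple",
    "red apple": "apple",
    "yellow apple": "apple",
    "apple": "pome fruits and stone fruits",
    "pome fruits and stone fruits": "fruits",
    "fruits": "plant-based products",
}

# Hypothetical direct findings: product -> set of impact statements.
DIRECT = {
    "plant-based products": {"positive: treatment of type 2 diabetes"},
    "red apple": {"example finding about red apples"},
}


def ancestors(product):
    """Return the parents on the product's own path, nearest first."""
    path = []
    while product in PARENT:
        product = PARENT[product]
        path.append(product)
    return path


def descendants(product):
    """Return every child, grandchild, etc. of the product."""
    children = defaultdict(list)
    for child, parent in PARENT.items():
        children[parent].append(child)
    found, stack = [], list(children[product])
    while stack:
        node = stack.pop()
        found.append(node)
        stack.extend(children[node])
    return found


def inherited_impacts(product):
    """Union of the product's own, its descendants' and its ancestors' impacts."""
    impacts = set(DIRECT.get(product, set()))
    for relative in ancestors(product) + descendants(product):
        impacts |= DIRECT.get(relative, set())
    return impacts
```

In this toy setup, `inherited_impacts("green apple")` picks up the diabetes finding from “plant-based products”, but not the finding attached to its sibling “red apple”, because siblings are neither children nor parents on its path.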
We also want to make sure that value chains are taken into account. This means that, for example, the GHG emissions caused in mining shouldn’t just be the mining companies’ headache, but should also be partly allocated to all industries that use the metals and minerals that are dug up. Again, an example for “apple”:

Products are also linked to each other according to their position in value chains

Upstream
  Supplier: Pesticides
  Supplier: Tractor
  Supplier: Apple tree fertilizer
Internal
  Product: Apple
Downstream
  Customer: Fruit wholesaler
  Customer: Bakery
  End user: …eating apples

Upstream: Impacts caused by suppliers of a product or service
Internal: Impacts caused by internal operations when manufacturing a product or providing a service
Downstream: Impacts caused by using a product or service
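
To make the idea of partly allocating an upstream impact to customers concrete, here is a minimal Python sketch. The allocation rule is an assumption made for illustration, not the Upright algorithm: the function `allocate_upstream`, the `pass_through` fraction and all numbers are hypothetical.

```python
def allocate_upstream(supplier_impact, purchase_shares, pass_through=0.5):
    """Split a supplier's impact between the supplier and its customers.

    supplier_impact: the supplier's internal impact (any unit).
    purchase_shares: customer -> share of the supplier's output it buys (sums to 1).
    pass_through:    hypothetical fraction of the impact handed downstream.
    """
    if abs(sum(purchase_shares.values()) - 1.0) > 1e-9:
        raise ValueError("purchase shares must sum to 1")
    passed = supplier_impact * pass_through
    allocation = {"supplier": supplier_impact - passed}
    for customer, share in purchase_shares.items():
        allocation[customer] = passed * share
    return allocation


# Made-up numbers: a mine emitting 100 units, selling to two industries.
result = allocate_upstream(100.0, {"car industry": 0.75, "electronics": 0.25})
```

With these made-up inputs the mine keeps half of the emissions and the rest is split between its customers in proportion to what they buy, so the total stays at 100 units.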
Our second input is a structure of the most significant ways a company can impact the world around it. The impact structure is modular (impacts can be added and removed using value sets) and constantly iterated with experts.

The Upright model considers 19 impact categories in 4 dimensions

Impacts (negative or positive)
Environment
  • GHG emissions
  • Non-GHG emissions
  • Biodiversity
  • Fresh water
  • Waste
Health
  • Diseases
  • Diet
  • Physical activity
  • Relationships
  • Meaning & joy
Society
  • Taxes
  • Jobs
  • Societal infrastructure
  • Equality
  • Societal stability & understanding among people
Knowledge
  • Scarce human capital
  • Knowledge infrastructure
  • Creating knowledge
  • Distributing knowledge

Now comes the tricky part. We need data about the impacts that the products and services have on the environment, health of people, society and creation and distribution of knowledge. This is where we need our engineering skills: to teach a neural network to understand causalities in natural language, i.e., the scientific articles.

Over the last 6 months, we have built a pretty exceptional training data set of over 30 000 scientific articles which we have read and classified manually to train the neural network.

How are causalities read by a machine?

Overview of Upright's ML-based article classifier

After getting the raw data from the neural network, we form scores for each combination of product and impact using an algorithm that combines information about 1) the prediction distribution, 2) number of articles studying a particular product/impact pair, 3) position and relative relevance of the product in its value chains, and 4) position and relative relevance of the product in its product family hierarchies.

After this, we have the basic building blocks for forming net impact profiles for all products and services. By summing these up, we get companies - which we can further sum up to form portfolios, funds, industries and other entities whose impact we wish to understand.

And that’s it! It’s a pretty crazy exercise, but we really believe this is a problem worth fighting for. This is our first effort towards solving it. We invite you to follow our progress, cheer us on, contribute and/or build your own solutions!

Explore

We want to make our model open access and free to use for employees and consumers. In the meantime, you can take a sneak peek at its current status via these videos:

  • Why is the current impact discourse not enough? It’s time we take the discussion to the next level.
  • The net impact of cigarettes and sewers. Take a sneak peek into how the Upright model works in practice.
  • Helping investors put their money where their values are. Moving from compliance data to understanding the actual impacts of products and services.

Data

Where does the data come from?

When answering questions about companies’ impact on the surrounding world, one of the big questions is: where to get the data? Most organizations offering solutions today are using data reported by companies themselves. This is a pragmatic approach, as there is a lot of company data available.

Upright, however, is taking a different approach and building the backbone of its model based on scientific data. We want to make the role of marketing and branding communications smaller, and bring scientific knowledge to center stage. We also want to help facilitate a dialogue between the producers and practitioners of data - researchers and business leaders.

The first version of the Upright model forms the “backbone” for impact scores using data from three primary sources:

  • Open access database The Core (https://core.ac.uk/) consisting of 80 million scientific papers
  • Datasets by global authorities on, among other things, productivity, education level, complexity of work, tax cumulation and knowledge-intensity (e.g. OECD PIAAC)
  • English Wikipedia corpus
  • (Company websites)

The Upright methodology of reading causalities in natural language and combining that information with direct numerical input allows us to gradually add many different types of data sources (e.g. other academic corpuses, news data). More about the status of the model today and what plans we have for additional data sources in the future can be read here.

Comparability

How are different impacts made comparable?

One of the biggest challenges in measuring net impact is: how can we put all the different impacts into the same unit? How can we compare greenhouse gases, diseases and taxes to one another? Upright’s approach to this question is the following:

We start by building a model where all impact categories, such as GHG emissions, diseases or knowledge creation, are assumed to be of equal weight and importance. This means that all 19 impact categories contribute equally to the final net score. We call this representation the “equal weights value set”.

Next we define the score for company X in impact Y to be: impact Y caused by company X / total amount of impact Y by all companies globally. In this way, we end up with (teeny tiny) quotients.

Example
Facebook's positive jobs score = all jobs caused by Facebook / all jobs caused by all companies globally

It is worth noting that this is an abstract concept. This is the ideal for which we seek proxies.
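
As a small illustration of the quotient, the following Python sketch computes a company's share of a global total. The function name `impact_share` and the figures are made up for the example; the real model uses proxies rather than directly measured totals.

```python
def impact_share(company_impact, global_impact):
    """Score of company X in impact Y as a share of the global total."""
    if global_impact <= 0:
        raise ValueError("global impact must be positive")
    return company_impact / global_impact


# Made-up figures for a hypothetical company, for illustration only.
jobs_score = impact_share(50_000, 2_000_000_000)
```

Even for a large employer the quotient is tiny, which is why the scores are later rescaled to a more readable range.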

We then make it possible for the user to set their own optimization criteria. This means distributing 100 % of the weight across the impact categories in a way that describes what they want to achieve with their consuming, investing, working or business.

Example
Cathy the Consumer wants to minimize her carbon footprint in her daily grocery shopping, while at the same time supporting the creation of jobs for people. She allocates more weight to “negative impact in GHG emissions” and “positive impact in jobs” than to the other impact categories.
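
A value set can be thought of as a mapping from impact categories to weights that add up to 100 %. The Python sketch below shows one way to apply it. It rests on assumptions: the category subset, the signed scores (negative meaning harmful), Cathy's weights and the function names are hypothetical, and the actual model may combine positive and negative impacts differently.

```python
CATEGORIES = ["GHG emissions", "Jobs", "Diseases", "Taxes"]  # subset for brevity


def equal_weights(categories):
    """Equal weights value set: every category gets the same share of 100 %."""
    return {c: 1.0 / len(categories) for c in categories}


def net_score(scores, weights):
    """Weighted sum of signed category scores; weights must total 100 %."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1 (100 %)")
    return sum(weights[c] * scores.get(c, 0.0) for c in weights)


# Made-up signed scores for one product (negative = harmful).
scores = {"GHG emissions": -2.0, "Jobs": 3.0, "Diseases": -1.0, "Taxes": 1.0}

# A value set in the spirit of the Cathy example: more weight on GHG and jobs.
cathy = {"GHG emissions": 0.4, "Jobs": 0.4, "Diseases": 0.1, "Taxes": 0.1}

equal_result = net_score(scores, equal_weights(CATEGORIES))
cathy_result = net_score(scores, cathy)
```

The same product gets a different net score under each value set, which is the point: the user decides what is being optimized.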

For more, see how the scores are formed.

What is everyone else doing?

Practically all solutions do some kind of comparing of different impacts and fitting them into the same unit of measure. However, many of them use fixed value sets and do not make the assumptions transparent to the user of the data. For example, ESG ratings produced and sold for investors form one rating per company describing how “sustainable” it is. The rating calculation is based on a set of assumptions: what is considered to be “impact” (typical answer: “environment, social and governance” factors), how important each of them is compared to the others (typical answer: “equally important” or other weights chosen by analysts) and what is used as a proxy for whether or not the company is serving these impacts (typical answer: “whether or not they are complying with certain rules and standards”). Upright's aim is to make it possible for users to become aware of the values they are currently practicing, and to consciously strive for serving the values they actually want to serve.

Scores

How are the scores formed?

In the net impact profile for a product or company, you can see numbers. These are what we call impact scores. There are two phases in forming them. In the first one, we define the idea of relative and absolute impact scores and what questions they aim to answer. In the second one, we try to find proxies to populate the perfect idea with imperfect data.

The conceptual idea: relative and absolute scores for products and companies

There are two types of impact scores: absolute and relative ones. Absolute scores aim to tell you about the absolute impact a company has on the surrounding world. Naturally, this number would be much higher for a large company than for a small one. That's why we also need relative scores. They tell you how much “bang for the buck” you get, for example, as an investor investing a certain sum of money.

Example: Let’s imagine two almost identical apple pie bakeries. They both do things almost exactly the same way and bake identical delicious apple pies with exactly the same nutritional facts at exactly the same price, using exactly the same suppliers and selling to exactly the same customers. But one of them is small, with only USD 0.1 million of revenue. The other one is big, with USD 100 million of revenue. The relative score for these two companies would be the same, but the absolute score would be 1000 times bigger for the larger one.

In the current version of the Upright model, each product gets a relative score from -5 to 5 in each impact category. The best products in an impact category get a 5.0 and the worst get a -5.0.

Example: the product label “cigarettes” currently gets a 5.0 in negative diseases impact, as our algorithm suggests science says they cause the most diseases relative to volume of operations.

The link between relative and absolute scores is the volume of operations. We currently use revenue as proxy for volume.

Σ [relative score (-5…5) × size (proxy: revenue)] = absolute score
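
The bakery example can be checked with a short Python sketch of this formula. The relative score of 2.0 is made up, and the function name `absolute_score` is hypothetical; only the two revenue figures come from the example above.

```python
def absolute_score(products):
    """Sum of relative score x revenue over a company's products.

    products: list of (relative_score, revenue_usd) tuples.
    """
    for relative, _ in products:
        if not -5.0 <= relative <= 5.0:
            raise ValueError("relative score must lie in [-5, 5]")
    return sum(relative * revenue for relative, revenue in products)


# Made-up relative score of 2.0 for apple pies; revenues from the bakery example.
small_bakery = absolute_score([(2.0, 100_000)])        # USD 0.1 million
large_bakery = absolute_score([(2.0, 100_000_000)])    # USD 100 million
ratio = large_bakery / small_bakery
```

Because both bakeries share the same relative score, the ratio of their absolute scores equals the ratio of their revenues, whatever relative score is plugged in.
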
Our current way of forming proxies: scientific articles as primary data source

Currently we build the backbone for our model using a database of 80 million scientific papers. This means that we base the answer to “which products should get 5.0 and which -5.0, and what does the distribution look like” on causalities documented in scientific research. We have built a neural network that understands and classifies causalities in natural language. In order to do that, we first manually labeled more than 33 000 scientific papers and used them as training data for the neural network.

We operationalize each impact into impact terms. For example, to find relevant research on diseases we search with all relevant disease names. We do the same for all products and services.

We then form all “product term + impact term” pairs and search for scientific papers that mention both of them. We feed the papers into the neural network to figure out if they actually say something about the causality between the product and the impact, and if so, what they say. We do this as many times as there are pairs, and take into account value chain links and product family links as described in “How is the Upright product taxonomy built?”. The average number of scientific articles contributing to one product is currently roughly 130 000.
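
The pair-forming and search step can be sketched as follows. This is a toy illustration, not the production pipeline: the term lists, the three paper titles and the function `candidate_papers` are hypothetical, and plain substring matching stands in for whatever search the real system uses. The neural network classification that follows this step is not shown.

```python
from itertools import product as cartesian

# Hypothetical search terms; the real term lists are far longer.
PRODUCT_TERMS = ["apple", "cigarette"]
IMPACT_TERMS = ["type 2 diabetes", "lung cancer"]

# Toy stand-in for the article database.
PAPERS = [
    "Apple consumption and the risk of type 2 diabetes in adults.",
    "Cigarette smoking as a cause of lung cancer.",
    "A survey of orchard irrigation methods.",
]


def candidate_papers(product_terms, impact_terms, papers):
    """Map each (product term, impact term) pair to papers mentioning both."""
    hits = {}
    for p_term, i_term in cartesian(product_terms, impact_terms):
        matching = [
            text for text in papers
            if p_term in text.lower() and i_term in text.lower()
        ]
        if matching:
            hits[(p_term, i_term)] = matching
    return hits


pairs = candidate_papers(PRODUCT_TERMS, IMPACT_TERMS, PAPERS)
```

Co-mention only yields candidates. Whether a paper actually states a causal link, and in which direction, is what the classifier has to decide afterwards.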

This is pretty complex. The real judge of whether this works is the results: do they make sense or not? To check, we use real-world feedback data and look at how well it correlates with our model’s behaviour. You are welcome to judge for yourself in our free public crowdsourcing environment.