Sunday, February 23, 2020

Quick tip: Improving performance when retrieving a user using Microsoft Graph

There are different ways to look up a user with Microsoft Graph. In this blog post, I cover three approaches and how they can affect the performance of your application, especially if you are dealing with Office 365 tenants that have many thousands of users (40k, 500k, more?):

◾ Get a user based on the user principal name or user ID
◾ Filter by user principal name or user ID, e.g. $filter=userPrincipalName eq ''
◾ Get all users using paging, then do a conditional check

Let's take a look at the advantages and disadvantages of the API that retrieves a user directly by an identifier (user principal name or user ID):

💚 Fast! I've tested this API in different Office 365 tenants containing 40k to approx. 500k users. There is no significant delay.
💔 This API throws an exception if the specified user doesn't exist in the Office 365 tenant! That can become a big problem if you use this API to search for a user based on user input: if the value to look for is misspelled, the call throws. That is actually a reasonable way for the API to communicate that the user doesn't exist, but you have to handle this exception. Otherwise, your application will break. BTW, this is what the error looks like:


    {
        "error": {
            "code": "Request_ResourceNotFound",
            "message": "Resource '' does not exist or one of its queried reference-property objects are not present.",
            "innerError": {
                "request-id": "e4b98e04-cc46-4ea1-920d-13c61b6ab88f",
                "date": "2020-02-17T12:18:12"
            }
        }
    }

This is how error handling could be implemented. In this example, I'm using the Graph SDK and C#:
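
A minimal sketch of that error handling, assuming a version of the Microsoft Graph .NET SDK that still exposes the `.Request()` builder (v4 or earlier) and an already authenticated `GraphServiceClient`. The class and method names (`UserLookupById`, `TryGetUserAsync`) are made up for illustration:

```csharp
using System.Net;
using System.Threading.Tasks;
using Microsoft.Graph;

public static class UserLookupById
{
    // Returns the user, or null when Graph reports that the user does not exist.
    // "id" can be either a user principal name or a user object ID.
    public static async Task<User> TryGetUserAsync(GraphServiceClient graphClient, string id)
    {
        try
        {
            // Sends GET /users/{id | userPrincipalName}
            return await graphClient.Users[id].Request().GetAsync();
        }
        catch (ServiceException ex) when (ex.StatusCode == HttpStatusCode.NotFound)
        {
            // Graph answered 404 (error code "Request_ResourceNotFound").
            // Treat it as "no such user" instead of letting the exception escape.
            return null;
        }
    }
}
```

Only the "not found" case is swallowed here. Other failures, such as throttling or a missing permission, still surface as exceptions, which is usually what you want.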

Additionally, we can look for a user by using the OData query parameter "$filter", e.g. $filter=userPrincipalName eq ''. Here are the advantages and disadvantages of this API:

💚 Good internal error handling! It won't throw an exception if the user wasn't found. Instead it returns an empty array. The Graph SDK returns null
💚 Fast! I've tested this API in different Office 365 tenants containing 40k to approx. 500k users. There is no significant delay! It was a couple of milliseconds faster than the first approach, but that is definitely not game changing
💔 Nothing! 😀 At least I didn't find anything negative during my tests/work with this approach
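
The filter approach could look roughly like this with the Graph SDK and C#. This is a sketch under the same assumption of an SDK version with the `.Request()` builder (v4 or earlier); `UserLookupByFilter`, `BuildUpnFilter` and `FindUserAsync` are hypothetical names. The quote escaping is an addition that the approach needs as soon as the value comes from user input:

```csharp
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Graph;

public static class UserLookupByFilter
{
    // Builds the $filter expression. OData string literals escape a single
    // quote by doubling it, which matters for names such as o'brien@contoso.com.
    public static string BuildUpnFilter(string userPrincipalName)
    {
        var escaped = userPrincipalName.Replace("'", "''");
        return $"userPrincipalName eq '{escaped}'";
    }

    // Returns the matching user, or null if no user matches.
    public static async Task<User> FindUserAsync(GraphServiceClient graphClient, string userPrincipalName)
    {
        // Sends GET /users?$filter=userPrincipalName eq '...'
        var page = await graphClient.Users
            .Request()
            .Filter(BuildUpnFilter(userPrincipalName))
            .GetAsync();

        // No match is not an error: this handles both a null result
        // and an empty page, so there is nothing to catch.
        return page?.FirstOrDefault();
    }
}
```

The caller only needs a null check, e.g. `var user = await UserLookupByFilter.FindUserAsync(graphClient, input);` followed by `if (user == null) { /* not found */ }`.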

Now, let's have a look at the approach "Get all users using paging then do conditional check".

💚 Flexibility! You have full control over the objects. For instance, it is possible to create complex conditional checks
💔 Performance! This high level of flexibility has a very high price. During my tests, it took one minute to go through all users in a tenant with 40k users. In a tenant with 500k users, it took more than 20 minutes to walk through all users. For sure, you could and should stop the loop once you have found the specified user, which will probably reduce the time needed. But since this API doesn't return all tenant users in a single call, you may still have to make many calls before you find the user you are looking for, which impacts performance negatively
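
For completeness, a sketch of the paging approach with an early exit, again assuming a Graph SDK version with the `.Request()` builder (v4 or earlier). `UserLookupByPaging` and `FindFirstAsync` are hypothetical names, and the predicate stands in for whatever complex check you need:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Graph;

public static class UserLookupByPaging
{
    // Walks the tenant page by page and returns the first user that
    // satisfies "matches", or null if no user does.
    public static async Task<User> FindFirstAsync(GraphServiceClient graphClient, Func<User, bool> matches)
    {
        // Sends GET /users?$top=999. A large page size keeps the number
        // of round trips as low as possible.
        var page = await graphClient.Users.Request().Top(999).GetAsync();

        while (page != null)
        {
            foreach (var user in page)
            {
                if (matches(user))
                {
                    return user; // stop early, skip the remaining pages
                }
            }

            // NextPageRequest is null once the last page has been read.
            page = page.NextPageRequest != null
                ? await page.NextPageRequest.GetAsync()
                : null;
        }

        return null;
    }
}
```

A call could look like `await UserLookupByPaging.FindFirstAsync(graphClient, u => u.DisplayName == "Adele Vance")`. Even with 999 users per page, a tenant with 500k users needs roughly 500 sequential requests in the worst case, which is where the time goes.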


If you want to search for a user, consider using the first or second approach, since both perform well. There might be cases where the "Get all users using paging" approach is useful. But if you don't need it, consider replacing this logic with one of the other approaches. Otherwise, your application will suffer from performance problems when it is installed in tenants with a large number of users.
