We briefly discuss further measures for reducing returns: first those suggested by the initial data analyses within the research project, which do not necessarily require machine learning techniques, and then the more complex methods for which such techniques are necessary.
First, assuming that returns follow a power-law distribution, so that a small share of customers and items accounts for a large share of returns, the return rate can be reduced trivially by blocking customers with a high return rate and/or by de-listing items with a high return rate. An analysis of historical data can show whether these measures make sense in individual cases and whether the associated loss in sales or profit is acceptable.
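The kind of historical analysis described above can be sketched as follows. This is a minimal illustration, not the project's actual procedure: the order records, the tuple layout, the function names `return_rates` and `simulate_blocking`, and the thresholds `max_rate` and `min_orders` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical order history: (customer_id, item_id, price, returned)
orders = [
    ("c1", "a", 50.0, True),
    ("c1", "b", 40.0, True),
    ("c1", "c", 30.0, True),
    ("c1", "a", 50.0, False),
    ("c2", "a", 50.0, False),
    ("c2", "b", 40.0, False),
    ("c2", "c", 30.0, True),
    ("c3", "b", 40.0, False),
    ("c3", "c", 30.0, False),
    ("c3", "a", 50.0, False),
]


def return_rates(rows, key_index):
    """Group rows by the given tuple position; return {key: (n_orders, return_rate)}."""
    counts = defaultdict(lambda: [0, 0])  # key -> [ordered, returned]
    for row in rows:
        counts[row[key_index]][0] += 1
        counts[row[key_index]][1] += int(row[3])
    return {key: (n, r / n) for key, (n, r) in counts.items()}


def overall_rate(rows):
    """Share of ordered articles that were returned."""
    return sum(1 for row in rows if row[3]) / len(rows) if rows else 0.0


def net_sales(rows):
    """Sales value of articles that were kept."""
    return sum(row[2] for row in rows if not row[3])


def simulate_blocking(rows, max_rate, min_orders=3):
    """Estimate the effect of blocking customers whose return rate exceeds max_rate.

    min_orders is an assumed guard against judging customers on too little data.
    """
    rates = return_rates(rows, key_index=0)
    blocked = {c for c, (n, rate) in rates.items() if n >= min_orders and rate > max_rate}
    kept = [row for row in rows if row[0] not in blocked]
    return {
        "blocked": blocked,
        "rate_before": overall_rate(rows),
        "rate_after": overall_rate(kept),
        "net_sales_lost": net_sales(rows) - net_sales(kept),
    }


result = simulate_blocking(orders, max_rate=0.5)
print(result)
```

Calling `return_rates(orders, key_index=1)` gives the same statistic per item, which would be the starting point for a de-listing decision. The trade-off between `rate_after` and `net_sales_lost` is what has to be judged case by case.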
Providing an exact size table with measurements in cm, where available, instead of the usual imprecise size labels (e.g. S, M, L) is another simple measure that can significantly reduce return rates. Improvements of 9% to 46% have been reported in the literature.
It is also possible to derive the size labels directly from the exact size tables in cm, independently of the supplier. The labels then become more accurate and are consistent at least within a shop. Machine learning techniques (e.g. logistic regression, multi-dimensional scaling, k-means clustering) can readily be applied for this purpose. A disadvantage of this method is that the delivered articles typically carry a printed size different from the one ordered, which can make it harder for customers to match articles in selection orders and will tend to confuse them in normal orders as well.
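As an illustration of the k-means variant, the following sketch clusters a single measurement (chest circumference in cm) into three shop-wide size classes. It is a simplified, one-dimensional version of Lloyd's algorithm written without external libraries. The measurements, the choice of one feature, the number of classes and the function names `kmeans_1d` and `shop_size` are assumptions for the example; a real size table would have several measurements per article.

```python
def kmeans_1d(values, k, iterations=50):
    """Plain Lloyd's algorithm on one-dimensional data; returns sorted centroids."""
    data = sorted(values)
    # Deterministic start: spread the initial centroids evenly over the sorted data.
    centroids = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        updated = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # assignments no longer change
        centroids = updated
    return sorted(centroids)


def shop_size(chest_cm, centroids, labels):
    """Map a measurement to the label of the nearest centroid."""
    nearest = min(range(len(centroids)), key=lambda i: abs(chest_cm - centroids[i]))
    return labels[nearest]


# Hypothetical chest measurements in cm, pooled from several suppliers' size tables
measurements = [88, 90, 91, 96, 98, 99, 104, 106, 107]
labels = ["S", "M", "L"]
centroids = kmeans_1d(measurements, k=3)

for cm in (93, 100, 110):
    print(cm, "cm ->", shop_size(cm, centroids, labels))
```

Because the centroids are learned from the pooled measurements, the same physical size receives the same shop label regardless of what the supplier printed on the article, which is exactly the source of the labelling mismatch mentioned above.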
A related extension is to extract further qualitative information from the size tables in addition to standard clothing sizes (e.g. figure-accentuating, figure-enhancing, loose-fitting, ...) and to present it to webshop visitors as compact additional information. Typically, this information should also be generated directly from the article data, which is usually unproblematic given suitable training data.
If exact size tables are not available, the second source of information about clothing size is the ordered and returned articles of each individual customer. However, this source can only be used at customer level for customers with many orders, since a certain amount of data is needed to estimate the customer's clothing size with sufficient certainty, regardless of the method used. For new customers in particular, this approach can therefore be applied only to a very limited extent. Collective orders and gift purchases make the analysis more difficult, whereas selection orders are helpful, provided they can be identified beyond doubt.
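A very simple version of such a customer-level estimate is sketched below: the most frequently kept size is accepted only if enough kept articles exist and they agree sufficiently. This is one possible heuristic, not the method used in the project. The function name `estimate_size`, the history format and the thresholds `min_kept` and `min_share` are assumptions for the example.

```python
from collections import Counter


def estimate_size(history, min_kept=5, min_share=0.6):
    """Return the customer's most frequently kept size, or None if evidence is too thin.

    history: list of (size_label, returned) tuples for one customer.
    min_kept and min_share are assumed thresholds and would need calibration.
    """
    kept = Counter(size for size, returned in history if not returned)
    total_kept = sum(kept.values())
    if total_kept < min_kept:
        return None  # too few kept articles, e.g. a new customer
    size, count = kept.most_common(1)[0]
    # Reject ambiguous histories, e.g. collective orders or frequent gift purchases.
    return size if count / total_kept >= min_share else None


# Hypothetical histories
regular = [("M", False)] * 5 + [("L", True)] * 2 + [("S", False)]
new_customer = [("M", False), ("L", True)]
household = [("S", False)] * 3 + [("L", False)] * 3

print(estimate_size(regular))
print(estimate_size(new_customer))
print(estimate_size(household))
```

The `household` case shows why collective orders complicate matters: the kept sizes split evenly, so no single size passes the `min_share` test and the function declines to give an estimate.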
All of these considerations assume that returns are driven mainly by clothes ordered in the wrong size. Of course, this does not have to be the case: style, colour and quality of the goods are at least as important. An initial analysis can show where the problem areas lie and in which direction to investigate first.